Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pkCTm-0005NV-5x for pgsql-hackers@arkaria.postgresql.org; Wed, 05 Apr 2023 23:23:50 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pkCTk-0004Yq-Uu for pgsql-hackers@arkaria.postgresql.org; Wed, 05 Apr 2023 23:23:48 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pkCTk-0004Yh-Hv for pgsql-hackers@lists.postgresql.org; Wed, 05 Apr 2023 23:23:48 +0000 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1pkCTh-0018oa-2d for pgsql-hackers@postgresql.org; Wed, 05 Apr 2023 23:23:47 +0000 Received: by mail-pj1-x102a.google.com with SMTP id h12-20020a17090aea8c00b0023d1311fab3so38877327pjz.1 for ; Wed, 05 Apr 2023 16:23:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1680737022; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=F3jzDfSKzWvhGg18m9EU9hvtqXrD94WQ5YABw1jjkEo=; b=QjukLBLE/xFa3PWXDh69cCbmuKzdftPzATKI/xjyrrZVCQzpdvDKMnusIAu85w7ALa 8ocRHiBukNEuDvRov33TIxKPb0v02JKYU2b/PoZzRRacblYCqbI1c7ywGnH2FSZ66UZ+ cdjH1v9PwnE1hoBOHZci5kBpz0ro84D9e0wcd40g0M6QiyKwr6t3lzSRHHNbOw6WN9sx 77Klxrz8H6UvZz4vQ9cXzcvgkCSKm7EWT1qDzQxthKftNyfrzI+bGGE32Q2L7ZZY82CA MKwPPJGPIa/bMc+d14NthI7rkN4zH4OHnN0b1cNil6nazHFNRrICIXlELiSHsIF7/wOl jSog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680737022; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=F3jzDfSKzWvhGg18m9EU9hvtqXrD94WQ5YABw1jjkEo=; b=5HNHy2rjnCX0SvToFp55ljlOL9nRW13z2OEFWHdiXfI+lDTadg64M8K2cTWjQ3Aw1L bvFwP7yPcxgRHyu8GIwjc879u4mTvqWp+kVk23pLOXQPNwJuZz8cIMfvhfMyh9mAScc5 TteDbCSIivrMee0GS+0ZFsXJdeHBmjYdIwNmAazOX7LowuC1HryaAn6ndV6LR2gWEuLH uvDzlZbagILJUay08M/LE1xmkb0309CHEpCZCaXyIRsc5dpwY8tTs/ayQ1pr7m8/KIPa 113wlOJSIA5K8NhX+fLpVLLE57nyMOvh/A+VXxnmtxefCWU2qcQ96mLItfrCEsQCzgsp W27Q== X-Gm-Message-State: AAQBX9eMhaLWCXf4u+EW1RJhH8g3mpBqefXgr0MFHF6R8/wHbC8EuwHi 5sWuC3J7o03IZKg7n9MbtOGyBX/Ox5clX9PwLKw= X-Google-Smtp-Source: AKy350Zgi15gW+TqURi2tQftpLIM5V61w2n9f51coCYwkFxBWhFdg0nquOlMpgd4moq+vr9LYhRcyr3SaFJXgVS3Fp4= X-Received: by 2002:a17:90a:fc85:b0:23d:2de7:717d with SMTP id ci5-20020a17090afc8500b0023d2de7717dmr3012601pjb.0.1680737022549; Wed, 05 Apr 2023 16:23:42 -0700 (PDT) MIME-Version: 1.0 References: <20221221101846.7zsi7lscnm5ediik@alvherre.pgsql> <1350682.1671635927@sss.pgh.pa.us> <4191508.1674157166@sss.pgh.pa.us> <349124.1674185509@sss.pgh.pa.us> <20230207180855.xy5m4puwh5gzd7xy@awork3.anarazel.de> <1274885.1680558101@sss.pgh.pa.us> In-Reply-To: From: Amit Langote Date: Thu, 6 Apr 2023 08:23:31 +0900 Message-ID: Subject: Re: generic plans and "initial" pruning To: Tom Lane Cc: Alvaro Herrera , Andres Freund , David Rowley , Jacob Champion , PostgreSQL-development , Robert Haas Content-Type: multipart/alternative; boundary="00000000000030aa3605f89f1557" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000030aa3605f89f1557 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Apr 4, 2023 at 10:29=E2=80=AFPM Amit Langote wrote: > On Tue, Apr 4, 2023 at 6:41=E2=80=AFAM Tom Lane wrote= : > > A few concrete thoughts: > > > > * I understand that your plan now is to acquire locks on all the > > originally-named tables, then do permissions checks (which will > > involve only those tables), then dynamically lock just inheritance and > > partitioning child tables as we descend the plan tree. > > Actually, with the current implementation of the patch, *all* of the > relations mentioned in the plan tree would get locked during the > ExecInitNode() traversal of the plan tree (and of those in > plannedstmt->subplans), not just the inheritance child tables. > Locking of non-child tables done by the executor after this patch is > duplicative with AcquirePlannerLocks(), so that's something to be > improved. > > > That seems > > more or less okay to me, but it could be reflected better in the > > structure of the patch perhaps. > > > > * In particular I don't much like the "viewRelations" list, which > > seems like a wart; those ought to be handled more nearly the same way > > as other RTEs. (One concrete reason why is that this scheme is going > > to result in locking views in a different order than they were locked > > during original parsing, which perhaps could contribute to deadlocks.) > > Maybe we should store an integer list of which RTIs need to be locked > > in the early phase? Building that in the parser/rewriter would provide > > a solid guide to the original locking order, so we'd be trivially sure > > of duplicating that. (It might be close enough to follow the RT list > > order, which is basically what AcquireExecutorLocks does today, but > > this'd be more certain to do the right thing.) I'm less concerned > > about lock order for child tables because those are just going to > > follow the inheritance or partitioning structure. > > What you've described here sounds somewhat like what I had implemented > in the patch versions till v31, though it used a bitmapset named > minLockRelids that is initialized by setrefs.c. Your idea of > initializing a list before planning seems more appealing offhand than > the code I had added in setrefs.c to populate that minLockRelids > bitmapset, which would be bms_add_range(1, list_lenth(finalrtable)), > followed by bms_del_members(set-of-child-rel-rtis). > > I'll give your idea a try. After sleeping on this, I think we perhaps don't need to remember originally-named relations if only for the purpose of locking them for execution. That's because, for a reused (cached) plan, AcquirePlannerLocks() would have taken those locks anyway. AcquirePlannerLocks() doesn't lock inheritance children because they would be added to the range table by the planner, so they should be locked separately for execution, if needed. I thought taking the execution-time locks only when inside ExecInit[Merge]Append would work, but then we have cases where single-child Append/MergeAppend are stripped of the Append/MergeAppend nodes by setrefs.c. Maybe we need a place to remember such child relations, that is, only in the cases where Append/MergeAppend elision occurs, in something maybe esoteric-sounding like PlannedStmt.elidedAppendChildRels or something? Another set of child relations that are not covered by Append/MergeAppend child nodes is non-leaf partitions. I've proposed adding a List of Bitmapset field to Append/MergeAppend named 'allpartrelids' as part of this patchset (patch 0001) to track those for execution-time locking. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com --=20 Thanks, Amit Langote EDB: http://www.enterprisedb.com --00000000000030aa3605f89f1557 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Tue, Apr 4, 2023 at 10:29=E2=80=AFPM Amit Langote <amitlangote09@gmail.com= > wrote:
> On Tue, Apr 4, 2023 at 6:41=E2=80=AFAM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > A few concrete thoughts:
> >
> > * I understand that your plan now is to acquire locks on all the<= br> > > originally-named tables, then do permissions checks (which will > > involve only those tables), then dynamically lock just inheritanc= e and
> > partitioning child tables as we descend the plan tree.
>
> Actually, with the current implementation of the patch, *all* of the > relations mentioned in the plan tree would get locked during the
> ExecInitNode() traversal of the plan tree (and of those in
> plannedstmt->subplans), not just the inheritance child tables.
> Locking of non-child tables done by the executor after this patch is > duplicative with AcquirePlannerLocks(), so that's something to be<= br> > improved.
>
> > That seems
> > more or less okay to me, but it could be reflected better in the<= br> > > structure of the patch perhaps.
> >
> > * In particular I don't much like the "viewRelations&quo= t; list, which
> > seems like a wart; those ought to be handled more nearly the same= way
> > as other RTEs.=C2=A0 (One concrete reason why is that this scheme= is going
> > to result in locking views in a different order than they were lo= cked
> > during original parsing, which perhaps could contribute to deadlo= cks.)
> > Maybe we should store an integer list of which RTIs need to be lo= cked
> > in the early phase?=C2=A0 Building that in the parser/rewriter wo= uld provide
> > a solid guide to the original locking order, so we'd be trivi= ally sure
> > of duplicating that.=C2=A0 (It might be close enough to follow th= e RT list
> > order, which is basically what AcquireExecutorLocks does today, b= ut
> > this'd be more certain to do the right thing.)=C2=A0 I'm = less concerned
> > about lock order for child tables because those are just going to=
> > follow the inheritance or partitioning structure.
>
> What you've described here sounds somewhat like what I had impleme= nted
> in the patch versions till v31, though it used a bitmapset named
> minLockRelids that is initialized by setrefs.c.=C2=A0 Your idea of
> initializing a list before planning seems more appealing offhand than<= br> > the code I had added in setrefs.c to populate that minLockRelids
> bitmapset, which would be bms_add_range(1, list_lenth(finalrtable)), > followed by bms_del_members(set-of-child-rel-rtis).
>
> I'll give your idea a try.

After sleeping on this, I think we perhaps don't need to remember origi= nally-named relations if only for the purpose of locking them for execution= .=C2=A0 That's because, for a reused (cached) plan, AcquirePlannerLocks= () would have taken those locks anyway.

AcquirePlannerLocks() doesn't lock inheritance children because they wo= uld be added to the range table by the planner, so they should be locked se= parately for execution, if needed.=C2=A0 I thought taking the execution-tim= e locks only when inside ExecInit[Merge]Append would work, but then we have= cases where single-child Append/MergeAppend are stripped of the Append/Mer= geAppend nodes by setrefs.c.=C2=A0 Maybe we need a place to remember such c= hild relations, that is, only in the cases where Append/MergeAppend elision= occurs, in something maybe esoteric-sounding like PlannedStmt.elidedAppend= ChildRels or something?

Another set of child relations that are not covered by Append/MergeAppend c= hild nodes is non-leaf partitions.=C2=A0 I've proposed adding a List of= Bitmapset field to Append/MergeAppend named 'allpartrelids' as par= t of this patchset (patch 0001) to track those for execution-time locking.<= /div>


--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
--00000000000030aa3605f89f1557--