Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vdCqs-00FKVH-0H for pgsql-hackers@arkaria.postgresql.org; Tue, 06 Jan 2026 19:36:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vdCqp-00AFaa-0c for pgsql-hackers@arkaria.postgresql.org; Tue, 06 Jan 2026 19:36:19 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vdCqo-00AFaR-1u for pgsql-hackers@lists.postgresql.org; Tue, 06 Jan 2026 19:36:19 +0000 Received: from mail-ed1-x52f.google.com ([2a00:1450:4864:20::52f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vdCqn-004aEK-1u for pgsql-hackers@lists.postgresql.org; Tue, 06 Jan 2026 19:36:18 +0000 Received: by mail-ed1-x52f.google.com with SMTP id 4fb4d7f45d1cf-6503b561daeso1989445a12.1 for ; Tue, 06 Jan 2026 11:36:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767728175; x=1768332975; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=2nwenQeQlSEAvL6cOrI1NfSaLFer3U9meNtiLWmvqKE=; b=LiUkDlLuhxQ9NaTRzDmlbgKTwbQjxX3hK4ppKQAdzoraADAofzmxPphWqeTXt4Q+hN CSgp1arB1nlKmzOqZlOJrjLEIsO46XRd+9iPfwvpiZEX2GJhEsQL6gd6kZ50HYQ+JLQn OjfOquTlm3KwR58iXqHpvcV3FDj/HBQboh8Il7bmUb3bYnmEWpDd3l5sM5yexoPj6Awm bzKJi9+Wh2hGf4MLZwxaJ99SSRFvNt1lMQW87aTBc1Teb9nM3euBMHWoEtOiiUoWABRb LB8Lqh5RR8twxd/e2592GKQNWTy0BCIJCc8+42MDdMR1WN9h8V7jM3UL5pCPGf63v2B3 eCRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767728175; x=1768332975; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=2nwenQeQlSEAvL6cOrI1NfSaLFer3U9meNtiLWmvqKE=; b=dnTstdvDGW4A67P87Bvy7CWYkpNnTqIGUFXh3nhHZK78dgiOEETWbzhtE8UZZ/Vgtt J0YxYdUpiuxj4P8OammEiT7AX1uM0F/2dEIAesyICNyYZVfqU2yiGvSf1mLLwKV+2SIZ as7Kc1cIGu6DIzHcisXUTWfS2BfL2snxyD5k8QvgC68o7yRXdDe6PGdOiZA7arBoil/h oiaEyVxvClANwSgIwwUgCM0Ke1eFnSRbH4Fl3I9xC8Nva/6CLcHHMCqQ4ZI2/Xz36XuO yI9LykYSjAe7I6Y9anIBNOrZQ7N9Y3gEahbd9cg673hgZkRVpHOvM5f98eMXComc0rSC 82Eg== X-Forwarded-Encrypted: i=1; AJvYcCX1AXPHQ7nIxzcNczwabLoK/Cvah8iObvSbgs89PwfJxTmi0yfIeWe4utfZH2+uoiZuSfVeGiz9tzsfUACQ@lists.postgresql.org X-Gm-Message-State: AOJu0YzvZI5ekKsAiEEsbdHWfSTNTu2p/vUnqbLdyIhTYYwAtPUFrxJk 5Seo+FCQpx3m+uwncKp3p7hajO08/l6Ut04wKC2lCPyF083oYkPJBaT7lvr6bVnyKpiCm/FaUtc yMA05Z+ePy5Wn/eLqD4WeLKIBjOHf3qA= X-Gm-Gg: AY/fxX4LTuBY1m1ufdDhTjlJLP1pppqBkimvXxTAJSSdrv7lXcHLt/RKKZW9XXJ9xh9 1vO8caCsO7NHZM13K0WN0d7xpNTGkBKKsQzmpVmBXIlhXrzw/j9VVjiG5le7uNMukEJ4m37VhHG 4qGUJSk4A+gDAFzN2Jne+vRAPLy1vE2LO0uykuYGoCWQBtSkl+veLSGP/dIVWR1sYBgUBexdQC8 a+nfHa6sCOVfWUeY02TdgCz1XWlnv2EdRK8WNAOl7BJVyn4LDgREY/1qcfrxzfYBjfCNbg8GnL1 S+N/fDhj4b5AzvXIXvZOTIOrvAA= X-Google-Smtp-Source: AGHT+IGaqqOQI7NLnJgUe+79lh7IwwGf71V6FWku5G5oZbHyXCNjEfyi37sAOdLB0/dg4hhqMgRofuH1/d1R0FVUAZ0= X-Received: by 2002:a17:907:3e10:b0:b84:201e:cd3f with SMTP id a640c23a62f3a-b8444c3fb5bmr25726166b.11.1767728175079; Tue, 06 Jan 2026 11:36:15 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Robert Haas Date: Tue, 6 Jan 2026 14:36:03 -0500 X-Gm-Features: AQt7F2rFIG969GdnfW7Wo3EwMbu9FdZQ2VFosdOsExaDfB54GJEBKKHc9A2zRaw Message-ID: Subject: Re: pg_plan_advice To: Lukas Fittl Cc: Jacob Champion , Dian Fay , Matheus Alcantara , Jakub Wartak , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, Dec 29, 2025 at 8:15=E2=80=AFPM Lukas Fittl wrote= : > For 0001, I'm not sure the following comment is correct: > > > /* When recursing =3D true, it's an unplanned or dummy subquery. */ > > rtinfo->dummy =3D recursing; > > Later in that function we only recurse if its a dummy subquery - in the c= ase of an unplanned subquery (rel->subroot =3D=3D NULL) > add_rtes_to_flat_rtable won't be called again (instead the relation RTEs = are directly added to the finalrtable). Maybe we can > clarify that comment as "When recursing =3D true, it's a dummy subquery o= r its children.". Presumably, a child of an unplanned or dummy subquery will also be unplanned or dummy, so I'm not sure I understand the need to clarify here. > From my medium-level understanding of the planner, I don't think the lack= of tracking unplanned subqueries > in subrtinfos is a problem, at least for the pg_plan_advice type use case= s. I don't think so, either. I believe that anything that falls into this category is something that is not actually going to be reflected in the final plan tree, but we can't lose track of it completely because it can matter for purposes like locking or invalidation. Since plan advice only targets things that appear in the final plan tree, it shouldn't care. If we did want to care, e.g. to emit advice like WE_ARE_EXPECTING_THIS_TO_BE_DUMMY(whatever_table), we'd need more than an rtoffset per subquery; we'd have to map each RTI individually, because some RTIs are tossed completely for in the "dummy" case, meaning that the rtoffset isn't constant for the whole subquery. AFAICT, this is not an issue because we need not care about the dummy subqueries at all. The only reason I included the SubPlanRTInfo at all for this case is that the previous SubPlanRTInfo might be for a non-dummy subquery, and some code might want to look at the next entry in the list to see where the portion of the range table belonging to that previous subqueries ends. This lets you do that. > For 0002: > > It might be helpful to clarify in a comment that ElidedNode's plan_node_i= d represents the surviving node, not that of the elided node. Good point. I'll add this comment: + * + * plan_node_id is that of the surviving plan node, the sole child of the + * one which was elided. > I also noticed that this currently doesn't support cases where multiple n= odes are elided, e.g. with multi-level table partitioning: > > CREATE TABLE pt (l1 date, l2 text) PARTITION BY RANGE (l1); > CREATE TABLE pt_202512 PARTITION OF pt FOR VALUES FROM ('2025-12-01') TO = ('2026-01-01') PARTITION BY LIST (l2); > CREATE TABLE pt_202512_TEST PARTITION OF pt_202512 FOR VALUES IN ('TEST')= ; > > EXPLAIN (RANGE_TABLE) SELECT * FROM pt WHERE l1 =3D '2025-12-15' AND l2 = =3D 'TEST'; > > QUERY PLAN > ------------------------------------------------------------------- > Seq Scan on pt_202512_test pt (cost=3D0.00..29.05 rows=3D1 width=3D36) > Filter: ((l1 =3D '2025-12-15'::date) AND (l2 =3D 'TEST'::text)) > Scan RTI: 3 > Elided Node Type: Append > Elided Node RTIs: 1 <=3D=3D=3D This is missing RTI 2 > RTI 1 (relation, inherited, in-from-clause): > Relation: pt > RTI 2 (relation, inherited, in-from-clause): > Relation: pt_202512 > RTI 3 (relation, in-from-clause): > Relation: pt_202512_test > Unprunable RTIs: 1 2 3 > > In a quick test, adding child_append_relid_sets (from 0003) to the relids= being passed to record_elided_node fixes > that. Presumably the case of partitionwise join relids doesn't matter, be= cause that would prevent it being elided. I'm not really sure there's a problem here. We definitely do not want to end up with something like "Elided Node RTIs: 1 2". What I've found experimentally is that it's often important to preserve relid sets, but you need to preserve them as sets, not individually. So there could be an argument that we somehow want to preserve both {1} and {2} here, but that's not equivalent to {1,2}, which looks like a partitionwise join between relid 1 and relid 2. But it isn't especially clear to me that we actually need to preserve RTI 2 here. One reason why preserving RTIs is important is so that as we descend a join tree, we can find the RTIs that the optimizer thought it was joining, but that only requires finding RTI 1, not RTI 2. Another reason why preserving RTIs is important is so that we can use relation identifiers to describe planning decisions made with respect to those RTIs, but that doesn't apply here because partition expansion just always happens. Of course, it's quite possible that there are reasons unrelated to this patch set why this information would be good to preserve, but if we want to do it, we're going to have to adjust the data representation somehow. We'd either need to give the ElidedNode a "cars" representation instead of a single RTI set, or we'd need to have some separate way of representing this. I hesitate a little bit to design something without a use case in mind, but maybe you have one? > For 0003: > > I also find the "cars" variable suffix a bit hard to understand, but not = sure a comment next to the variables is that useful. > Separately, the noise generated by all the additional "_cars" variables i= sn't great. > > I wonder a little bit if we couldn't introduce a better abstraction here,= e.g. a struct "AppendPathInput" that contains the > two related lists, and gets populated by accumulate_append_subpath/get_si= ngleton_append_subpath and then > passed to create_append_path as a single argument. I spent some time thinking about this day and haven't been quite able to come up with something that I like. The problem is that pa_partial_subpaths and pa_nonpartial_subpaths share a single child_append_relid_sets variable, namely pa_subpath_cars, and accumulate_append_subpaths gets called with that as the last argument and different things for the previous two. One thing I tried was making the AppendPathInput struct contain three lists rather than two, but then accumulate_append_subpath() needs an argument that makes it work in one of three different modes: Mode 1: normal -- add everything to the "normal" list Mode 2: building parallel-aware append with partial path -- add things to the "normal" list except for parallel-aware appends which need to be split between the normal and special lists Mode 3: building parallel-aware append with non-partial path -- add things to the "special" list I also tried splitting up accumulate_append_subpath() into two functions, thinking that maybe I could segregate the parallel-append handling from the more normal cases. This seems somewhat appealing in the sense that having accumulate_append_subpath() hold a bunch of extra logic that only one call site needs isn't very nice, but changing it doesn't really seem to help with the problem that we have two subpath lists sharing one cars list in this case. I'll try to find some more time to think about this, but if you have any ideas meanwhile, I'd be happy to hear them. > Note that 0005 needs a rebase, since 48d4a1423d2e92d10077365532d92e059ba2= eb2e changed the GetNamedDSMSegment API. I'll fix this. I don't love the way that commit made the callback and the callback arg non-consecutive arguments. > You may also want to move the CF entry to the PG19-4 commitfest so CFbot = runs again. It seems that I cannot move it to 19-4. I moved it to 19-Final. --=20 Robert Haas EDB: http://www.enterprisedb.com