MIME-Version: 1.0
References: 
 <CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com>
 <CAKZiRmxtJAFG7e1+Vs9B8ngON=AOzJbuws+1ZeH4LsbJh5AzoQ@mail.gmail.com>
 <CA+TgmoY9Ne_Sh10u6LSPc3wvOQPLp3kF9nBp3nqJEG2JGF2QiA@mail.gmail.com>
 <CA+Tgmoa57S6mP=aTOXH2-gDAL4TMO1WbGgrHSg0s6J4zUH=04g@mail.gmail.com>
 <DEB1SAGVDX7S.JMV96X4R0RCS@gmail.com>
 <CA+Tgmoaf__2B0BUL+vrg28P+3buX=Ti-kybqkHiLTtFrrCfzuA@mail.gmail.com>
 <CA+TgmoYpcLNOuypOTdgCSLW7FuA=t6BtB3meTARHX2-Dj_81xQ@mail.gmail.com>
 <DEFNP3WX7BLX.1X7DS1P9X6JIF@nmfay.com>
 <CA+TgmoZjv9OyFu1Gkt78w0vWEti8S33w8trYHmErf-GMmGSi=w@mail.gmail.com>
 <DELPC4B56Y5A.3OBZ3WCH3T4VX@nmfay.com>
 <CA+TgmoaOSBQD9Ux4eG40w723ZN=c0J7p-+oX4+J8urUeyLMo5w@mail.gmail.com>
 <CAOYmi+=g+MMoOpWkk2weXWKJcKH0eKey8gKHHdH0dF4Tiawrhw@mail.gmail.com>
 <CA+TgmobwaT=PXPDDrgDup+jA8KHBbkxigtziD-zNzAKKkQYVgQ@mail.gmail.com>
 <CAOYmi+mOmEW=amDRQMfw6-Fb3ZmDEQFaJzwk8Bc8W8DzaP85XQ@mail.gmail.com>
 <CA+TgmoaX2AMW4cdFM3OngBJxmxpkdmzF33R7-CWhvRLfucbFMg@mail.gmail.com>
 <CAOYmi+k4AyWCQHK=XVF99KVDuFkqxcADao61OWGLxu0nRYMONQ@mail.gmail.com>
 <CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>
 <CAP53Pkycc=7N2bLzVT3x+qE1JamvRZWev5tFjdLJ1+-AV3Di+Q@mail.gmail.com>
 <CA+TgmoaKhuD91RnazbRyGkmP7--JdNq8oNDC3UcgTZSWbMxC7w@mail.gmail.com>
 <CAP53Pkw5-wMEeDJXFmqo_RTyL_spzCXb7HHKrbSnQqokVoZcNQ@mail.gmail.com>
 <CA+Tgmob-69bzbdi3U_QtebqAf6u1y8js=5=oNK639csVe1VbhA@mail.gmail.com>
 <CA+TgmoaZMOikxK=LqS+Jn+835h9S139JLGk-3LyETVXw5W5j=w@mail.gmail.com>
 <1136161.1769654478@sss.pgh.pa.us>
 <CAP53PkwZ1ZTMARKg6iEfAw9qzBhkjBitj-9gr_Jvy7k2AwGgWA@mail.gmail.com>
 <CAMbWs4--NuEUFE_xTo991TRXaZryE29jarJPDnVxoaQOYdt7tA@mail.gmail.com>
 <CA+TgmobzR+XMGbRosVPbjHbSo4+cgJn=qZK6w05aF1sbj=C+9Q@mail.gmail.com>
 <CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com>
 <CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com>
 <CAK98qZ2RzbgCHrSg4zLkvpzyBam_X6te-KF8w1+_vON9BAVMEw@mail.gmail.com>
 <CA+TgmoaCdsuvNn6T6SfQ_0YD2Hh2+hgTXh9fTGHQhPg1zvy2rQ@mail.gmail.com>
 <CAK98qZ1zWzRB0ABG7ULzTeWKRR5C7-FxLqM-6v8wQDiFM+DNAg@mail.gmail.com>
 <CA+Tgmob7ozJAs5SU6bD2RfAt4w_AmsMGz-aaVA6WeLXHkBypOg@mail.gmail.com>
 <CAK98qZ1J42RoAsHnYWGPPmHziZmzmqE=Lp_O6WJ-9aKK2qjikA@mail.gmail.com>
 <CA+TgmoYjcBA6dw3nwiyfDzPXTCrxTZPXDMrc2TrDJcL1cPK6iA@mail.gmail.com>
 <CA+TgmoYru-vxoTKfwjQby30r2OkTXfb18Km_=VLs6qk8Akr0-g@mail.gmail.com>
 <CAKFQuwYxTMx1e1dvVLd5vs8CO2Xe__YRCW0m4v2UErG=5Lu7kQ@mail.gmail.com>
 <CA+TgmoaPH3rOmH1zw8s10Az3TQKRbDnF3dKYZCAFvx_DgFV5_Q@mail.gmail.com>
 <CAKFQuwZ7CRbvmJ60EjvEUJP8X9hWVCHJz8yJZ0-xRpYwL=fugA@mail.gmail.com>
 <CA+TgmoYsG2pAOG4YYWwnGsaL+QfOZopi_m+f_hZqBZ3meNEkjg@mail.gmail.com>
 <CAKFQuwY4i6xXCRHd=ccrFz0sQOfiWRxhobiBrzyK0rzu1-Y33Q@mail.gmail.com>
 <CA+Tgmobbj_TaCsYmr1grJBTRKaFaxFfotXn1T6LSXs9GQ8_Kyw@mail.gmail.com>
 <CA+TgmoZzM2i+p-Rxdphs4qx7sshn-kzxF91ASQ5duOo0dFRXLQ@mail.gmail.com>
 <CAP53PkzKeD=t90OfeMsniYrcRe2THQbUx3g6wV17Y=ZtiwmWTQ@mail.gmail.com>
 <CA+TgmoYMpO=+EQwDhMoH=zmTGbB6TD5-_9bciNEoDArP8JjSMw@mail.gmail.com>
 <CAP53PkxZSHxShLKEp9wvy1FcYPcXMCpu1fJR+Q6BzLhmknZgsg@mail.gmail.com>
In-Reply-To: 
 <CAP53PkxZSHxShLKEp9wvy1FcYPcXMCpu1fJR+Q6BzLhmknZgsg@mail.gmail.com>
From: Robert Haas <robertmhaas@gmail.com>
Date: Tue, 17 Mar 2026 09:44:34 -0400
Message-ID: 
 <CA+TgmobtbB74PTUtCQwUNR8D9ZZwKH5ZptoU30ONm34TR954Nw@mail.gmail.com>
Subject: Re: pg_plan_advice
To: Lukas Fittl <lukas@fittl.com>
Cc: "David G. Johnston" <david.g.johnston@gmail.com>,
	PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: 
 <https://www.postgresql.org/message-id/CA%2BTgmobtbB74PTUtCQwUNR8D9ZZwKH5ZptoU30ONm34TR954Nw%40mail.gmail.com>
Precedence: bulk

On Mon, Mar 16, 2026 at 7:25=E2=80=AFPM Lukas Fittl <lukas@fittl.com> wrote=
:
> That's fair. I would like us to do something about this in the PG20
> release cycle - for my part, I think its reasonable to follow the
> declarative partitioning parent-child relationship for indexes if
> present - assuming we can sort out the performance/etc. aspects of
> that.

I wonder if instead of doing this as you did it, we should try to make
partition expansion happen around expand_inherited_rtentry time. So
maybe you can write something like PARTITIONED_SEQ_SCAN(x) or
PARTITIONED_INDEX_SCAN(x y) and, internally, that gets replaced with
one SEQ_SCAN(..) or INDEX_SCAN(...) entry per child before the core
planning logic engages. And, during the plan walk, we could do the
reverse transformation: if we computed matching advice items for every
partition, consolidate them down to just one before constructing the
final advice string. If we did that, the whole thing would be
symmetric and we'd have a certain amount of automatic test coverage,
plus we'd shorten a lot of automatically written advice strings
considerably. Maybe this is not better than the more on-the-fly way
that you did it, but I think it's worth some study.

While I'm on the subject, there are some other opportunities for
brevity that I have not pursued for this release. In particular,
NO_GATHER(...) often seems quite tedious to me. We could introduce a
wildcard, like "*", that just means everything, so you could write
NO_GATHER(*) for non-parallel queries. However, that seems like it's
actually not great, because as soon as you have a single Gather or
Gather Merge node in the plan, you're back to needing to write all the
other ones out. So another idea is to let you write something like
ONLY_EXPLICIT_GATHER() as a no-argument incantation that means that
everything that isn't mentioned as a GATHER() or GATHER_MERGE() target
should be considered NO_GATHER(). Or you could call that something
like NO_GATHER_OTHERS() or whatever. Or maybe there could be some
general default facility that lets you say what should happen when
nothing is specified, like DEFAULTS(NO_GATHER), but right now the
number of things that you could apply that to would be quite limited.

> For 19 I think we might want to consider calling this out more
> explicitly in the documentation under the "Scan Method Advice"
> paragraph, i.e. that one cannot specify partition parent table names
> (at least not ones that have no data of their own) and instead one has
> to specify the partitions individually. Otherwise I think users will
> just be confused by the Append node that says "Disabled: true" and the
> advice that didn't match.

We can consider that, but I think the bigger picture here is that
writing advice strings by hand is hard, and that's why EXPLAIN
(PLAN_ADVICE) and pg_collect_advice exist -- to give you a starting
point. I fear that if we try to enumerate a lot of specific examples
of ways in which people might be confused in the documentation, they
won't read it, and they'll still be confused. I think the primary
focus of the documentation should be to get people to use the advice
generation facilities as their main way to discover how to use the
system, and then pointing out specific things that may still be
confusing where it makes sense is also good to do. For example, in a
case like this, if you sit down and write INDEX_SCAN(partitioned_table
partitioned_index), yeah that's not going to work, and you're going to
be confused (potentially). But that's not really what you're supposed
to do. You're supposed to start by running EXPLAIN (PLAN_ADVICE) on
the query whose plan you're trying to manipulate, and if you do that,
you'll see that the generated advice shows a separate INDEX_SCAN() or
SEQ_SCAN() item for each child table, and then it's sorta obvious what
you're supposed to be doing. You may very well not like that -- I
think many, many people are going to complain about it -- but you'll
understand what is possible with the system that we have. Now that is
not to say that I think you're wrong about documenting this, and I've
certainly tried to document some other instances of things that I
found confusing even as the author of the system. However, there's
also a lot of cases that I haven't tried to document because it's just
too much useless, abstract information. On this particular point, if
there's a nice plan to fit this into the documentation that doesn't
feel like a jarring topic shift or a long detour into minor details,
I'm fine with it, but I don't think it's worth a lot of contortions to
fit this specific thing in considering how many other things there
are.

> I think the common case would be someone sees the planner picked a
> Nested Loop, and instead wants to see the plan that prefers a Hash
> Join (or Merge Join), e.g. to understand costing differences. That's
> how I usually use pg_hint_plan, to understand what the alternate plan
> looked like that the planner didn't pick, but where costs were close.
> The top-level "enable_nestloop =3D off" often tends to not work that
> well for complex plans, hence my historic use of pg_hint_plan's
> HASHJOIN/MERGEJOIN (or NO_NESTLOOP) hints for this purpose.

It's interesting to me that this works well for you even in complex
plans. For example, let's say I have something like this (omitting
sorts):

Merge Join
-> Nested Loop
  -> Hash Join
    -> Seq Scan on A
    -> Hash
      -> Seq Scan on B
  -> Index Scan on C
-> Nested Loop
  -> Seq Scan on D
  -> Index Scan on E

Now, what do you write here to get rid of the nested loop involving C?
Your example before of USE_HASH_JOIN(x y) seemed to imply that you
mentioned two tables, but this is a 3-table join, so are you
mentioning all three tables in this case, like USE_HASH_JOIN(A B C)?
Or are you mentioning just C, or A and C, or what? In pg_plan_advice,
it's just HASH_JOIN(C). I'm unclear what the right answer is in
pg_hint_plan. The documentation says that HashJoin(table table [...])
forces hash join for the joins on the tables specified, but I don't
know whether that means that the whole thing functions as a single
constraint (i.e. the N-way join product of all mentioned tables should
be computed using a HashJoin at the uppermost level) or as a
constraint per table (every mentioned table should be involved in a
hash join). If it's the former, then you could write HashJoin(A B C)
in this case, but that wouldn't preclude switching the join order so
that the A-C join is done first using a Nested Loop and then the join
to B is done afterward as a HashJoin, which is probably not what you
wanted. If it's the latter, then HashJoin(C) is probably good enough,
although not necessarily. If there are join clauses connecting B and
C, the planner could try to cheat by doing a hash join between B and C
first and then doing a Nested Loop join to A afterwards, which is also
probably not what you wanted, but typically there won't be such join
clauses so it will work out. Also, if the HashJoin(C) just means there
has to be a hash join somewhere above C, rather than that C has to be
on onside or the other of a hash join by itself, then the planner
could also cheat by switching the outer merge join to a hash join, but
I'm guessing that probably isn't what it means. But if that's the
case, then what would you write you did want to replace the outer
merge join with a hash join? Both sides have more than one table, so
if HashJoin(x) means x has to appear alone on one side of the join,
there would be no way to get what you want here. I wish the actual
behavior of pg_hint_plan were better-documented here, and I'd
appreciate an explanation of how you use it in a case like this and
what actually happens.

But all that having been said, I do think there's space for softer
constraints than what pg_plan_advice currently offers. For example,
there's currently no way to say "I'd like an index scan but I don't
care which index you use". I've been thinking we could invent
ANY_INDEX_SCAN() and ANY_INDEX_ONLY_SCAN() for that purpose at some
point, or maybe people would prefer negative constraints instead, like
NO_SEQ_SCAN(). And maybe your idea of either-way join method
constraints falls in that category too. It's easiest for me to imagine
someone wanting that for merge join, where the two sides are treated
more nearly symmetrically. But I think we would need to nail down
exactly what the semantics of that are. Given that we've got a
sublists available as a tool, we could define a "symmetric join method
request" to take two arguments where the first is the relation
identifiers, or a list of all the relation identifiers, that appear on
one side of the join, and the same for the other side of the join. So
in the example above, if you wanted to replace the nested loop with a
hash join in one direction or the other, you could write
FLIPFLOPPY_HASH_JOIN((A B) C), and if you wanted to replace the outer
merge join, you could write FLIPFLOPPY_HASH_JOIN((A B C) (D E)). I'm
not altogether convinced that's better than just writing HASH_JOIN(C)
or HASH_JOIN((D E)), and there's some definitely user and code
complexity to supporting both methods, but maybe it will turn out to
be the right thing.

(It also occurs to me that the proposed semantics of
FLIPFLOPPY_HASH_JOIN are actually both stronger and weaker than the
existing HASH_JOIN directive. It's weaker in that the sides of the
join can be switched, but it's stronger in that it constrains what has
to be on both sides of the join, whereas HASH_JOIN does not do that.
Of course, as is hopefully clear by now, this is not the only possible
set of semantics that we could choose to implement here: things that
seem simple when you think about 2-table cases are often not simple at
all when scaled up to more complex situations. More than anything
else, I want whatever we implement to be extremely well-defined, with
absolutely no room for debate about what a given advice tag does or
whether a certain plan complies with a certain advice item.)

> Later in the release cycle I'll see if I can put together a community
> resource that compares pg_hint_plan to pg_plan_advice, and where they
> differ. I suspect many end users will have similar questions, and
> whilst I don't think explaining the differences belongs in the regular
> Postgres docs, it could fit the wiki as a cheatsheet of sorts.

Yeah, that sounds nice.

> Right - I can see the usefulness for testing, but I worry that people
> use it on production systems and then experience unexpected
> performance issues. That said, we could address that with a warning in
> the docs noting its not intended for production use.

I think it depends a lot on what you mean by "production use". I think
it's definitely true that pg_collect_advice is not intended for
continuous advice collection. If you try to use it that way, you'll
either start throwing away entries you wanted to keep (if the
collection limit is low) or you'll blow out memory (if the collection
limit is high). But it's completely reasonable to enable it on a
production server in a controlled way, to collect all the queries and
advice strings for one representative transaction of each type (or
even 10 or 50 representative transactions of each type). So I feel
like this is about understanding how it's intended to be used, and the
answer is definitely not "just like pg_stat_statements!".

BTW, I wonder if it would be worth considering, obviously for next
release cycle rather than this one, extending pg_stat_statements to
have the ability to grab plan advice as well, rather than building the
query normalization and deduplication features that pg_stat_statements
already has into pg_collect_advice.

--=20
Robert Haas
EDB: http://www.enterprisedb.com