Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wSIOh-0035p6-0K for pgsql-hackers@arkaria.postgresql.org; Wed, 27 May 2026 17:50:27 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wSIOf-0090W4-0G for pgsql-hackers@arkaria.postgresql.org; Wed, 27 May 2026 17:50:26 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wSIOe-0090Vw-2T for pgsql-hackers@lists.postgresql.org; Wed, 27 May 2026 17:50:25 +0000 Received: from mail-ed1-x542.google.com ([2a00:1450:4864:20::542]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wSIOd-00000001jHp-0a8g for pgsql-hackers@lists.postgresql.org; Wed, 27 May 2026 17:50:25 +0000 Received: by mail-ed1-x542.google.com with SMTP id 4fb4d7f45d1cf-68a26413288so2694444a12.2 for ; Wed, 27 May 2026 10:50:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1779904221; cv=none; d=google.com; s=arc-20240605; b=HsFeic/loKC7bYJtWdVbYkUnL6ZcFTW1FX62oQCr3Y9bRototNx4wc5HzdPV8oOQIs JfcHDAyObVUO+uyMExnSYwCUH8RlpYibcTuHJpj4Kt4ZrR98ukJE4FxlMc0X/If0QOxw iZGbj8k+sl9YmeZ1ETOmklBx9zFlWxapY/VgkLCHgiqZJRhUOckypPGZVJdSFlQcUYj/ LuBjJ+g6KOZEyZ8QUUgkObMmZ6Z0opE6zh7YJ6JpNOIvj5lTqXp4hBSgCCO8mzWMYHGD 2t4HEj+r6ysMVmpHaszEtrVEJsMhu27BCN20SE6D8ONWXKk1+ItGWUGuD4aJIW0GF/9G g7MQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=PuMln8pOkLGSkD0wYVykkeFn2sADnRVLY4yvUDEPpyA=; fh=BV7CCCU3nZQf1Yk8zxXXiuVURO1kPfDK20BU54Kv0RE=; b=b8DwuLGycdo+5ob/Ii4j84VE5kVXuU77KihAbIDonIrMGNjPkxL2HeoOjr64DhvBrA 1lh57/Zvou/22qpBre3QFdZEhc4weFzKBlCaJE9ttPAllHS1b9Cxk/wbX7iBf5+6JRoQ KvgGPJMF4wT6lAz4V7K+OKH/CFv/F9YKBADORFh694zheDPEAdHPDuQNLvknusBvNika nZlA7JL2jDRtiNfGaCRQtN5uHv/t3Mjn1Kh8lOdy+FV+sXeC1TeJkG+fTXIzMyNIjmdp wpMrDxzVE998yKdfCLz83jkMONlJQ3w9nLqGelpT7Jp3B/vkBmnz7EyXzhRJPb/VHNLS pRFw==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779904221; x=1780509021; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=PuMln8pOkLGSkD0wYVykkeFn2sADnRVLY4yvUDEPpyA=; b=Otq4jhv3QCFwX5pa0u1L9LqrKM5Vg2riPrpF/840lEVTOQTjCxEhoTQSDPean3/8xt h9cNynI7hM91SsAJpEB0dy8/chud25TZZgxBymx735MlonsIx7ZIRI+nJTaMtxW2p79v NaS34nDh/9GLd+zd/+EnCvnG2ZP+kFPl6AgN+PTs+IIeHyJlG6iO5DohuFGAJh0X8XJD O4kSwdaoK0Vp3mIhoI+6COQiPT8c7rjGH0MAb4jHzRVEGi0UuJdYByPo/YxRGvwFnFoE zPl5b+XfSh3AjD1W4SHsuKNBSczledJsCiwe8G0ju/kEh9hNOfhc6MiwuVLpN46/Y+xV KlSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779904221; x=1780509021; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=PuMln8pOkLGSkD0wYVykkeFn2sADnRVLY4yvUDEPpyA=; b=rpKFrg7S0ypfOxhe+GYeH8dSw2ch2fmDy6/E+vhsqBu4EiAi1H6dM9fab9Ba0nO1Ky 4/8e/sJ4P/uBREFTOqufIwM5sw/cOnJy8EZr66CWixWdmAF8Uv2cfXECLlbIy6JKzLuJ e0+ZOm/pJgj3/1gph20A4GYp6rrsytChqmpTc7JjcGkfFg5FGP1VHJKk14hjRTRWxN/j V8GpwqSHF+MoKePy+dJIT1gms8bvGGwB/iGMX4j6R3D47AqtLKbIuq501ZzM5dm+f4gF Wc0y/zPLMPjfgFNik73IFAoc9mXQevU3UtlJOWazesH/iDJTLcEIAN7QRpKHAJcu/Y+c 01Rw== X-Forwarded-Encrypted: i=1; AFNElJ+kDKWPRYBrXoRWSQYcXHfHnjvBblCbVE2Hd47U0SkBxMLI2gKgTMi+l7+nQigWgKoVkbSdBgWpRfTLjwUK@lists.postgresql.org X-Gm-Message-State: AOJu0Yz4fx1rInJH6vLIVYPnjDx1y47U+f+gajOFyOJpijFhmsxgPvTF KZXIa88wimfyrNacfehXwVfh9RAkb3HbxTWeCCOQwM1BtlkPIkWSqcH3iwzcJGeeV+LWxGrvqWT H9Uak83wvWWxn2C4kCZkzbp3E3dfXUZ8= X-Gm-Gg: Acq92OG538E4i+9NWx6ubvBc9NZVguXEmC977hVC+jvOzJQNX/5F0fJUpfLU/izgvoj PjvtcrEVCzPdKHTIY9SEWbv7ljoHYNsr1ht/348ed04xCYuWYvGqWdr/2yp10n+YRM+owETCpMT VzDMm+e7U8LpS1YLXmQs/VU+Pblwua2jOSTfxz+lOPrs7yR1nzYqVEl1PGzyV6t9EUl4eEumvlC gNWZ5QO6rTztD77Xza6vrXDoedRclt1g7+o7ZDEvS9mkbuC6Yw/P8TdkyEJjboU7hcH0GWP7HeE nd5QySqQHQ== X-Received: by 2002:a17:906:9c9:b0:bcb:d9c7:c6f with SMTP id a640c23a62f3a-bdd22a454a7mr1036771466b.7.1779904221199; Wed, 27 May 2026 10:50:21 -0700 (PDT) MIME-Version: 1.0 References: <8df3d212-5d60-4e30-9606-d8849f7d37ae@gmail.com> <3c477f2f-10e4-4705-bb21-90ccbe67e9d2@gmail.com> <24247.1779395113@sss.pgh.pa.us> <910a4628-720a-4912-af8f-8b5a96a0b336@vondra.me> <2468544.1779721437@sss.pgh.pa.us> In-Reply-To: <2468544.1779721437@sss.pgh.pa.us> From: Alexandra Wang Date: Wed, 27 May 2026 10:49:44 -0700 X-Gm-Features: AVHnY4JIWvYSG3CLnM5zRKcrApF5ZT0H8i9Xp5VVnn1Z69gRrgzl8xWtcA8ONjA Message-ID: Subject: Re: Is there value in having optimizer stats for joins/foreignkeys? To: Tom Lane Cc: Tomas Vondra , jian he , pgsql-hackers@lists.postgresql.org, Andrei Lepikhov , Corey Huinker , hs@cybertec.at, Jeff Davis Content-Type: multipart/alternative; boundary="000000000000d6c2090652d03e9f" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000d6c2090652d03e9f Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Tom and Tomas, Thank you so much for the feedback! On Mon, May 25, 2026 at 8:04=E2=80=AFAM Tom Lane wrote: > Tomas Vondra writes: > > On 5/21/26 22:25, Tom Lane wrote: > >> I don't love stxkeyrefs[]. I wonder if it's time to throw away > >> stxkeys[], represent all the target columns as regular expression > >> trees in stxexprs, and then special-case columns that are simple > >> Vars where appropriate at execution. > >> (In the same vein, I dislike the grammar's separation of plain > >> columns from expressions; I'd like to replace stats_params > >> with expr_list and sort it all out later. But perhaps that's > >> material for a separate patch.) > > > FWIW the extended stats copied this from pg_index, which also stores > > keys and expressions separately. I suppose there was a reason for that, > > most likely performance - is cheaper to compare attnums than > > expressions, and plain keys are much more common. > > I think I might be to blame for the separate storage of indexprs. > If so, the motivation was to avoid breakage of older code that only > knew about indkey[]. (Of course, such code would necessarily fail > on indexes with expressions, but we wanted to avoid breakage for the > common case of no-expressions.) I don't think that consideration is > nearly as pressing for extended stats. There's probably a lot less > client-side code that knows about extended stats at all, and what > there is seems more likely to rely on the server-side display > functions than to dig into the catalog details for itself. Also, > if there is anything that's looking at pg_statistic_ext details, > it will need work anyway after this patch; there's no way around that. I'm working on removing stxkeys[] as a prerequisite commit before the main join stats patch, representing all target columns as Var nodes in stxexprs, as you both suggested. One question about the pg_stats_ext view: currently it has two complementar= y columns: - attnames (name[]) =E2=80=94 Names of the columns included in the statisti= cs object - exprs (text[]) =E2=80=94 Expressions included in the statistics object With stxkeys gone from the catalog, should the view: (a) Stay as-is: keep attnames and exprs as separate columns with the same semantics. Implemented via a helper function that extracts plain column names from the unified stxexprs. or (b) Mirror the catalog: remove attnames, make exprs show all entries (both column names and expressions together in one text[] array). Any preference? --=20 Alexandra Wang EDB: https://www.enterprisedb.com --000000000000d6c2090652d03e9f Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Tom and Tomas,

Thank you so much for the fe= edback!

On Mon, May 25, 2026 at 8:04=E2=80=AFAM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Toma= s Vondra <tomas@vondra.me> wri= tes:
> > On 5/21/26 22:25, Tom Lane wrote:
> >> I don&= #39;t love stxkeyrefs[].=C2=A0 I wonder if it's time to throw away
&= gt; >> stxkeys[], represent all the target columns as regular express= ion
> >> trees in stxexprs, and then special-case columns that = are simple
> >> Vars where appropriate at execution.
> &g= t;> (In the same vein, I dislike the grammar's separation of plain> >> columns from expressions; I'd like to replace stats_par= ams
> >> with expr_list and sort it all out later.=C2=A0 But pe= rhaps that's
> >> material for a separate patch.)
> <= br>> > FWIW the extended stats copied this from pg_index, which also = stores
> > keys and expressions separately. I suppose there was a = reason for that,
> > most likely performance - is cheaper to compa= re attnums than
> > expressions, and plain keys are much more comm= on.
>
> I think I might be to blame for the separate storage o= f indexprs.
> If so, the motivation was to avoid breakage of older co= de that only
> knew about indkey[]. =C2=A0(Of course, such code would= necessarily fail
> on indexes with expressions, but we wanted to avo= id breakage for the
> common case of no-expressions.) =C2=A0I don'= ;t think that consideration is
> nearly as pressing for extended stat= s.=C2=A0 There's probably a lot less
> client-side code that know= s about extended stats at all, and what
> there is seems more likely = to rely on the server-side display
> functions than to dig into the c= atalog details for itself.=C2=A0 Also,
> if there is anything that= 9;s looking at pg_statistic_ext details,
> it will need work anyway a= fter this patch; there's no way around that.

I'm working on = removing stxkeys[] as a prerequisite commit before the main join
stats p= atch, representing all target columns as Var nodes in stxexprs, as you
b= oth suggested.

One question about the pg_stats_ext view: currently i= t has two complementary
columns:

- attnames (name[]) =E2=80=94 Na= mes of the columns included in the statistics object
- exprs (text[]) = =E2=80=94 Expressions included in the statistics object

With stxkeys= gone from the catalog, should the view:

(a) Stay as-is: keep attnam= es and exprs as separate columns with the same
semantics. Implemented vi= a a helper function that extracts plain column names
from the unified st= xexprs.

or

(b) Mirror the catalog: remove attnames, make expr= s show all entries (both
column names and expressions together in one te= xt[] array).

Any preference?

--
Alexandra Wang
<= font face=3D"monospace">EDB: https://www.enterprisedb.com
--000000000000d6c2090652d03e9f--