public inbox for [email protected]  
help / color / mirror / Atom feed
pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
7+ messages / 5 participants
[nested] [flat]

* pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-04 13:17  Alexander Korotkov <[email protected]>
  0 siblings, 1 reply; 7+ messages in thread

From: Alexander Korotkov @ 2025-04-04 13:17 UTC (permalink / raw)
  To: [email protected]

Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate

This commit implements the automatic conversion of 'x IN (VALUES ...)' into
ScalarArrayOpExpr.  That simplifies the query tree, eliminating the appearance
of an unnecessary join.

Since VALUES describes a relational table, and the value of such a list is
a table row, the optimizer will likely face an underestimation problem due to
the inability to estimate cardinality through MCV statistics.  The cardinality
evaluation mechanism can work with the array inclusion check operation.
If the array is small enough (< 100 elements), it will perform a statistical
evaluation element by element.

We perform the transformation in the convert_ANY_sublink_to_join() if VALUES
RTE is proper and the transformation is convertible.  The conversion is only
possible for operations on scalar values, not rows.  Also, we currently
support the transformation only when it ends up with a constant array.
Otherwise, the evaluation of non-hashed SAOP might be slower than the
corresponding Hash Join with VALUES.

Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com
Author: Alena Rybakina <[email protected]>
Author: Andrei Lepikhov <[email protected]>
Reviewed-by: Ivan Kush <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c0962a113d1f2f94cb7222a7ca025a67e9ce3860

Modified Files
--------------
src/backend/optimizer/plan/subselect.c    |  80 ++++++++
src/backend/optimizer/prep/prepjointree.c |  12 ++
src/backend/optimizer/util/clauses.c      |  14 +-
src/include/optimizer/subselect.h         |   3 +
src/test/regress/expected/subselect.out   | 308 ++++++++++++++++++++++++++++++
src/test/regress/sql/subselect.sql        | 100 ++++++++++
6 files changed, 512 insertions(+), 5 deletions(-)



^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-04 15:47  Melanie Plageman <[email protected]>
  parent: Alexander Korotkov <[email protected]>
  0 siblings, 2 replies; 7+ messages in thread

From: Melanie Plageman @ 2025-04-04 15:47 UTC (permalink / raw)
  To: Alexander Korotkov <[email protected]>; +Cc: [email protected]; PostgreSQL Hackers <[email protected]>

On Fri, Apr 4, 2025 at 9:17 AM Alexander Korotkov
<[email protected]> wrote:
>
> Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
>
> This commit implements the automatic conversion of 'x IN (VALUES ...)' into
> ScalarArrayOpExpr.  That simplifies the query tree, eliminating the appearance
> of an unnecessary join.

I haven't looked at this patch, but it seems likely that it is related
to the recent failures I noticed in CI in the subselect test when the
regress suite is run by 002_pg_upgrade.pl

diff --strip-trailing-cr -U3
C:/cirrus/src/test/regress/expected/subselect.out
C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out
--- C:/cirrus/src/test/regress/expected/subselect.out 2025-04-04
14:44:17.637206600 +0000
+++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out
2025-04-04 14:47:20.358393500 +0000
@@ -2769,15 +2769,16 @@
 EXPLAIN (COSTS OFF)
 SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid)
 WHERE c.oid IN (VALUES (1), (2));
-                          QUERY PLAN
----------------------------------------------------------------
- Hash Join
-   Hash Cond: (a.oid = c.oid)
-   ->  Seq Scan on pg_am a
-   ->  Hash
-         ->  Index Scan using pg_class_oid_index on pg_class c
-               Index Cond: (oid = ANY ('{1,2}'::oid[]))
-(6 rows)
+                       QUERY PLAN
+---------------------------------------------------------
+ Merge Join
+   Merge Cond: (c.oid = a.oid)
+   ->  Index Scan using pg_class_oid_index on pg_class c
+         Index Cond: (oid = ANY ('{1,2}'::oid[]))
+   ->  Sort
+         Sort Key: a.oid
+         ->  Seq Scan on pg_am a
+(7 rows)





^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-04 16:03  Alexander Korotkov <[email protected]>
  parent: Melanie Plageman <[email protected]>
  1 sibling, 0 replies; 7+ messages in thread

From: Alexander Korotkov @ 2025-04-04 16:03 UTC (permalink / raw)
  To: Melanie Plageman <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; [email protected]; PostgreSQL Hackers <[email protected]>

On Fri, Apr 4, 2025 at 6:47 PM Melanie Plageman
<[email protected]> wrote:
>
> On Fri, Apr 4, 2025 at 9:17 AM Alexander Korotkov
> <[email protected]> wrote:
> >
> > Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
> >
> > This commit implements the automatic conversion of 'x IN (VALUES ...)' into
> > ScalarArrayOpExpr.  That simplifies the query tree, eliminating the appearance
> > of an unnecessary join.
>
> I haven't looked at this patch, but it seems likely that it is related
> to the recent failures I noticed in CI in the subselect test when the
> regress suite is run by 002_pg_upgrade.pl


Thank you for pointing.  I'm looking right now.

------
Regards,
Alexander Korotkov
Supabase





^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-07 09:15  David Rowley <[email protected]>
  parent: Melanie Plageman <[email protected]>
  1 sibling, 2 replies; 7+ messages in thread

From: David Rowley @ 2025-04-07 09:15 UTC (permalink / raw)
  To: Melanie Plageman <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; [email protected]; PostgreSQL Hackers <[email protected]>

On Mon, 7 Apr 2025 at 19:39, Melanie Plageman <[email protected]> wrote:
> +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out
> 2025-04-04 14:47:20.358393500 +0000
> @@ -2769,15 +2769,16 @@
>  EXPLAIN (COSTS OFF)
>  SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid)
>  WHERE c.oid IN (VALUES (1), (2));
> -                          QUERY PLAN
> ----------------------------------------------------------------
> - Hash Join
> -   Hash Cond: (a.oid = c.oid)
> -   ->  Seq Scan on pg_am a
> -   ->  Hash
> -         ->  Index Scan using pg_class_oid_index on pg_class c
> -               Index Cond: (oid = ANY ('{1,2}'::oid[]))
> -(6 rows)
> +                       QUERY PLAN
> +---------------------------------------------------------
> + Merge Join
> +   Merge Cond: (c.oid = a.oid)
> +   ->  Index Scan using pg_class_oid_index on pg_class c
> +         Index Cond: (oid = ANY ('{1,2}'::oid[]))
> +   ->  Sort
> +         Sort Key: a.oid
> +         ->  Seq Scan on pg_am a
> +(7 rows)

Are these failures from patches applied to master prior to 3ba2cdaa?

David





^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-07 13:08  Alexander Korotkov <[email protected]>
  parent: David Rowley <[email protected]>
  1 sibling, 0 replies; 7+ messages in thread

From: Alexander Korotkov @ 2025-04-07 13:08 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Melanie Plageman <[email protected]>; Alexander Korotkov <[email protected]>; [email protected]; PostgreSQL Hackers <[email protected]>

Hi, David!

On Mon, Apr 7, 2025 at 12:15 PM David Rowley <[email protected]> wrote:
>
> On Mon, 7 Apr 2025 at 19:39, Melanie Plageman <[email protected]> wrote:
> > +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out
> > 2025-04-04 14:47:20.358393500 +0000
> > @@ -2769,15 +2769,16 @@
> >  EXPLAIN (COSTS OFF)
> >  SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid)
> >  WHERE c.oid IN (VALUES (1), (2));
> > -                          QUERY PLAN
> > ----------------------------------------------------------------
> > - Hash Join
> > -   Hash Cond: (a.oid = c.oid)
> > -   ->  Seq Scan on pg_am a
> > -   ->  Hash
> > -         ->  Index Scan using pg_class_oid_index on pg_class c
> > -               Index Cond: (oid = ANY ('{1,2}'::oid[]))
> > -(6 rows)
> > +                       QUERY PLAN
> > +---------------------------------------------------------
> > + Merge Join
> > +   Merge Cond: (c.oid = a.oid)
> > +   ->  Index Scan using pg_class_oid_index on pg_class c
> > +         Index Cond: (oid = ANY ('{1,2}'::oid[]))
> > +   ->  Sort
> > +         Sort Key: a.oid
> > +         ->  Seq Scan on pg_am a
> > +(7 rows)
>
> Are these failures from patches applied to master prior to 3ba2cdaa?

Yes, these failures appears before 3ba2cdaa.  Tom committed 3ba2cdaa
to fix the problem before I get into it [1].

Links.
1. https://www.postgresql.org/message-id/srnuqlttuimzmvoulhsrbgvj4vnul6b65osswvua7sfkqsvmuy%40yg7apybpx...

------
Regards,
Alexander Korotkov
Supabase





^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-07 13:50  Melanie Plageman <[email protected]>
  parent: David Rowley <[email protected]>
  1 sibling, 1 reply; 7+ messages in thread

From: Melanie Plageman @ 2025-04-07 13:50 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Alexander Korotkov <[email protected]>; [email protected]; PostgreSQL Hackers <[email protected]>

On Mon, Apr 7, 2025 at 5:15 AM David Rowley <[email protected]> wrote:

> Are these failures from patches applied to master prior to 3ba2cdaa?


Yea, my email was held in moderation for days. I guess cross-posting is
flagged. I thought I saw people regularly cc pgsql-hackers when replying to
pgsql-committers, but I guess not.


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
@ 2025-04-07 15:04  Tom Lane <[email protected]>
  parent: Melanie Plageman <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Tom Lane @ 2025-04-07 15:04 UTC (permalink / raw)
  To: Melanie Plageman <[email protected]>; +Cc: David Rowley <[email protected]>; Alexander Korotkov <[email protected]>; PostgreSQL Hackers <[email protected]>

Melanie Plageman <[email protected]> writes:
> Yea, my email was held in moderation for days. I guess cross-posting is
> flagged. I thought I saw people regularly cc pgsql-hackers when replying to
> pgsql-committers, but I guess not.

Yeah, that used to be common practice, but we made a policy change
to discourage cross-posting a few years ago.  Too many newbies
cross-posting questions to umpteen lists.

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 7+ messages in thread


end of thread, other threads:[~2025-04-07 15:04 UTC | newest]

Thread overview: 7+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-04-04 13:17 pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate Alexander Korotkov <[email protected]>
2025-04-04 15:47 ` Melanie Plageman <[email protected]>
2025-04-04 16:03   ` Alexander Korotkov <[email protected]>
2025-04-07 09:15   ` David Rowley <[email protected]>
2025-04-07 13:08     ` Alexander Korotkov <[email protected]>
2025-04-07 13:50     ` Melanie Plageman <[email protected]>
2025-04-07 15:04       ` Tom Lane <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox