public inbox for [email protected]  
help / color / mirror / Atom feed
From: Richard Guo <[email protected]>
To: Pg Hackers <[email protected]>
Subject: Re: Convert NOT IN sublinks to anti-joins when safe
Date: Wed, 4 Feb 2026 18:47:37 +0900
Message-ID: <CAMbWs49tBNbW7S7VCbTasJEfPQKKU7fbJT2TMoi9uHMhT6rwqw@mail.gmail.com> (raw)
In-Reply-To: <CAMbWs495eF=-fSa5CwJS6B-BaEi3ARp0UNb4Lt3EkgUGZJwkAQ@mail.gmail.com>
References: <CAMbWs495eF=-fSa5CwJS6B-BaEi3ARp0UNb4Lt3EkgUGZJwkAQ@mail.gmail.com>

On Tue, Feb 3, 2026 at 4:12 PM Richard Guo <[email protected]> wrote:
> This topic has been discussed several times in the past.  Due to the
> semantic mismatch regarding NULL handling, NOT IN is not ordinarily
> equivalent to an anti-join.  However, if we can prove that neither the
> outer expressions nor the subquery outputs can yield NULL values, it
> should be safe to convert NOT IN to an anti-join.

I've noticed a loose end in the v1 patch.

The semantic gap between NOT IN and anti-join actually exists whenever
the operator returns NULL.  For NOT IN, if (A op B) returns NULL, then
NOT (NULL) evaluates to NULL (effectively false), and the row is
discarded.  In contrast, for an anti-join, if (A op B) returns NULL,
it implies no match was found, and the anti-join logic dictates that
the row should be kept.

To guarantee that (A op B) never returns NULL, the current patch
verifies that both A and B are non-nullable.  However, this is not
sufficient.  The "op" might be an operator that returns NULL on
non-null inputs.

On the other hand, if "op" does not return NULL on NULL inputs, like
IS DISTINCT FROM, we technically would not even need to require that A
and B are non-nullable.

Is there a convenient way to verify that an operator never returns
NULL on non-null inputs?  Would it be sufficient to insist that the
operator belongs to btree opclass (assuming that the strict ordering
requirements of btree imply this safety)?

And, is it worth checking if an operator never returns NULL even on
NULL inputs?  If we can identify such operators, we should be able to
remove the requirement that both sides of NOT IN must be non-nullable.
Is there a convenient way to check for such operators?

- Richard






view thread (13+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Convert NOT IN sublinks to anti-joins when safe
  In-Reply-To: <CAMbWs49tBNbW7S7VCbTasJEfPQKKU7fbJT2TMoi9uHMhT6rwqw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox