public inbox for [email protected]
help / color / mirror / Atom feedFrom: Richard Guo <[email protected]>
To: Pg Hackers <[email protected]>
Subject: var_is_nonnullable() fails to handle invalid NOT NULL constraints
Date: Fri, 10 Apr 2026 17:48:26 +0900
Message-ID: <CAMbWs48ALW=mR0ydQ62dGS-Q+3D7WdDSh=EWDezcKp19xi=TUA@mail.gmail.com> (raw)
While fixing another bug in var_is_nonnullable(), I noticed $subject.
The NOTNULL_SOURCE_SYSCACHE code path (newly added for the NOT IN to
anti-join transformation) checks pg_attribute.attnotnull, which can be
true even for invalid (NOT VALID) NOT NULL constraints.
The consequence is that query_outputs_are_not_nullable() could wrongly
conclude that a subquery's output is non-nullable, causing NOT IN to
be incorrectly converted to an anti-join.
The attached fix checks the attnullability field in the relation's
tuple descriptor instead, which correctly distinguishes valid from
invalid constraints. This is also consistent with what we do in
get_relation_notnullatts().
It could be argued that the added table_open/table_close call is a
performance concern, but I don't think so:
1. The relation is already locked by the rewriter, so
table_open(rte->relid, NoLock) is just a relcache lookup.
2. This code path is only reached when converting NOT IN to an
anti-join, and only after the outer side of the test expression has
already been proved non-nullable.
3. It is only called for relation RTEs in the subquery.
Thoughts?
- Richard
Attachments:
[application/octet-stream] v1-0001-Fix-var_is_nonnullable-to-handle-invalid-NOT-NULL.patch (8.7K, 2-v1-0001-Fix-var_is_nonnullable-to-handle-invalid-NOT-NULL.patch)
download | inline diff:
From 5192aef920936133c7ffb7482465e9a1847a0164 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 10 Apr 2026 16:51:18 +0900
Subject: [PATCH v1] Fix var_is_nonnullable() to handle invalid NOT NULL
constraints
The NOTNULL_SOURCE_SYSCACHE code path in var_is_nonnullable() used
get_attnotnull() to check pg_attribute.attnotnull, which is true for
both valid and invalid (NOT VALID) NOT NULL constraints. An invalid
constraint does not guarantee the absence of NULLs, so this could lead
to incorrect results. For example, query_outputs_are_not_nullable()
could wrongly conclude that a subquery's output is non-nullable,
causing NOT IN to be incorrectly converted to an anti-join.
Fix by checking the attnullability field in the relation's tuple
descriptor instead, which correctly distinguishes valid from invalid
constraints, consistent with what the NOTNULL_SOURCE_HASHTABLE code
path already does.
While at it, rename NOTNULL_SOURCE_SYSCACHE to NOTNULL_SOURCE_CATALOG
to reflect that this code path no longer uses a syscache lookup, and
remove the now-unused get_attnotnull() function.
---
src/backend/optimizer/util/clauses.c | 34 +++++++++++++++++++------
src/backend/utils/cache/lsyscache.c | 27 --------------------
src/include/optimizer/optimizer.h | 2 +-
src/include/utils/lsyscache.h | 1 -
src/test/regress/expected/subselect.out | 23 +++++++++++++++++
src/test/regress/sql/subselect.sql | 14 ++++++++++
6 files changed, 64 insertions(+), 37 deletions(-)
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index cd4e2e86c6d..598b4edacce 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -20,6 +20,7 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/table.h"
#include "catalog/pg_class.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_language.h"
@@ -59,6 +60,7 @@
#include "utils/jsonpath.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
@@ -2134,7 +2136,7 @@ query_outputs_are_not_nullable(Query *query)
* parse tree, we need to look up the not-null constraints from the
* system catalogs.
*/
- if (expr_is_nonnullable(&subroot, expr, NOTNULL_SOURCE_SYSCACHE))
+ if (expr_is_nonnullable(&subroot, expr, NOTNULL_SOURCE_CATALOG))
continue;
if (IsA(expr, Var))
@@ -4696,13 +4698,19 @@ var_is_nonnullable(PlannerInfo *root, Var *var, NotNullSource source)
return bms_is_member(var->varattno, notnullattnums);
}
- case NOTNULL_SOURCE_SYSCACHE:
+ case NOTNULL_SOURCE_CATALOG:
{
/*
- * We look up the "attnotnull" field in the attribute
- * relation.
+ * We check the attnullability field in the tuple descriptor.
+ * This is necessary rather than checking the attnotnull field
+ * from the attribute relation, because attnotnull is also set
+ * for invalid (NOT VALID) NOT NULL constraints, which do not
+ * guarantee the absence of NULLs.
*/
RangeTblEntry *rte;
+ Relation rel;
+ CompactAttribute *attr;
+ bool result;
rte = planner_rt_fetch(var->varno, root);
@@ -4723,7 +4731,17 @@ var_is_nonnullable(PlannerInfo *root, Var *var, NotNullSource source)
rte->relkind != RELKIND_PARTITIONED_TABLE)
return false;
- return get_attnotnull(rte->relid, var->varattno);
+ /*
+ * We need not lock the relation since it was already locked
+ * by the rewriter.
+ */
+ rel = table_open(rte->relid, NoLock);
+ attr = TupleDescCompactAttr(RelationGetDescr(rel),
+ var->varattno - 1);
+ result = (attr->attnullability == ATTNULLABLE_VALID);
+ table_close(rel, NoLock);
+
+ return result;
}
default:
elog(ERROR, "unrecognized NotNullSource: %d",
@@ -4746,9 +4764,9 @@ var_is_nonnullable(PlannerInfo *root, Var *var, NotNullSource source)
* - NOTNULL_SOURCE_HASHTABLE: Used when RelOptInfos are not yet available,
* but we have already collected relation-level not-null constraints into the
* global hash table.
- * - NOTNULL_SOURCE_SYSCACHE: Used for raw parse trees where neither
- * RelOptInfos nor the hash table are available. In this case, we have to
- * look up the 'attnotnull' field directly in the system catalogs.
+ * - NOTNULL_SOURCE_CATALOG: Used for raw parse trees where neither
+ * RelOptInfos nor the hash table are available. In this case, we check the
+ * column's attnullability in the tuple descriptor.
*
* For now, we support only a limited set of expression types. Support for
* additional node types can be added in the future.
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 9b2b97f5cba..fa9fe9ffa26 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1114,33 +1114,6 @@ get_attoptions(Oid relid, int16 attnum)
return result;
}
-/*
- * get_attnotnull
- *
- * Given the relation id and the attribute number,
- * return the "attnotnull" field from the attribute relation.
- */
-bool
-get_attnotnull(Oid relid, AttrNumber attnum)
-{
- HeapTuple tp;
- bool result = false;
-
- tp = SearchSysCache2(ATTNUM,
- ObjectIdGetDatum(relid),
- Int16GetDatum(attnum));
-
- if (HeapTupleIsValid(tp))
- {
- Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
-
- result = att_tup->attnotnull;
- ReleaseSysCache(tp);
- }
-
- return result;
-}
-
/* ---------- PG_CAST CACHE ---------- */
/*
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index e8b409afb7f..cb6241e2bdd 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -135,7 +135,7 @@ typedef enum
{
NOTNULL_SOURCE_RELOPT, /* Use RelOptInfo */
NOTNULL_SOURCE_HASHTABLE, /* Use Global Hash Table */
- NOTNULL_SOURCE_SYSCACHE, /* Use System Catalog */
+ NOTNULL_SOURCE_CATALOG, /* Use System Catalog */
} NotNullSource;
extern bool contain_mutable_functions(Node *clause);
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index e57795fa01f..2e0258d877e 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -99,7 +99,6 @@ extern Oid get_atttype(Oid relid, AttrNumber attnum);
extern void get_atttypetypmodcoll(Oid relid, AttrNumber attnum,
Oid *typid, int32 *typmod, Oid *collid);
extern Datum get_attoptions(Oid relid, int16 attnum);
-extern bool get_attnotnull(Oid relid, AttrNumber attnum);
extern Oid get_cast_oid(Oid sourcetypeid, Oid targettypeid, bool missing_ok);
extern char *get_collation_name(Oid colloid);
extern bool get_collation_isdeterministic(Oid colloid);
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 200236a0a69..a3778c23c34 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -3761,4 +3761,27 @@ WHERE NOT id ?= ANY (SELECT id FROM not_null_tab);
-> Seq Scan on not_null_tab not_null_tab_1
(5 rows)
+-- No ANTI JOIN: the inner side has an unvalidated NOT NULL constraint, so
+-- the column might contain NULLs.
+CREATE TEMP TABLE notnull_notvalid_tab (id int);
+INSERT INTO notnull_notvalid_tab VALUES (NULL);
+ALTER TABLE notnull_notvalid_tab ADD CONSTRAINT nn NOT NULL id NOT VALID;
+EXPLAIN (COSTS OFF)
+SELECT * FROM not_null_tab
+WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
+ QUERY PLAN
+----------------------------------------------------------
+ Seq Scan on not_null_tab
+ Filter: (NOT (ANY (id = (hashed SubPlan any_1).col1)))
+ SubPlan any_1
+ -> Seq Scan on notnull_notvalid_tab
+(4 rows)
+
+-- NOT IN with NULL on inner side should return no rows
+SELECT * FROM not_null_tab
+WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
+ id | val
+----+-----
+(0 rows)
+
ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 4cd016f4ac3..1a02c3f86c0 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -1632,4 +1632,18 @@ EXPLAIN (COSTS OFF)
SELECT * FROM not_null_tab
WHERE NOT id ?= ANY (SELECT id FROM not_null_tab);
+-- No ANTI JOIN: the inner side has an unvalidated NOT NULL constraint, so
+-- the column might contain NULLs.
+CREATE TEMP TABLE notnull_notvalid_tab (id int);
+INSERT INTO notnull_notvalid_tab VALUES (NULL);
+ALTER TABLE notnull_notvalid_tab ADD CONSTRAINT nn NOT NULL id NOT VALID;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM not_null_tab
+WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
+
+-- NOT IN with NULL on inner side should return no rows
+SELECT * FROM not_null_tab
+WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
+
ROLLBACK;
--
2.39.5 (Apple Git-154)
view thread (3+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: var_is_nonnullable() fails to handle invalid NOT NULL constraints
In-Reply-To: <CAMbWs48ALW=mR0ydQ62dGS-Q+3D7WdDSh=EWDezcKp19xi=TUA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox