public inbox for [email protected]help / color / mirror / Atom feed
Re: using index to speedup add not null constraints to a table 4+ messages / 2 participants [nested] [flat]
* Re: using index to speedup add not null constraints to a table @ 2026-02-06 04:58 jian he <[email protected]> 2026-04-07 07:39 ` Re: using index to speedup add not null constraints to a table jian he <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: jian he @ 2026-02-06 04:58 UTC (permalink / raw) To: Álvaro Herrera <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> On Wed, Feb 4, 2026 at 8:52 PM jian he <[email protected]> wrote: > > hi. > > v11 is attached. > Mainly fixes regress test failures. I have tested locally on https://cirrus-ci.com/build/5064590561640448 this time, the attachment should have no regress test failure. -- jian https://www.enterprisedb.com/ Attachments: [text/x-patch] v12-0001-using-indexscan-to-speedup-add-not-null-constraints.patch (21.0K, 2-v12-0001-using-indexscan-to-speedup-add-not-null-constraints.patch) download | inline diff: From 7363acf27135b0caad04703af27e5bae75fdcbe7 Mon Sep 17 00:00:00 2001 From: jian he <[email protected]> Date: Fri, 6 Feb 2026 11:05:30 +0800 Subject: [PATCH v12 1/1] using indexscan to speedup add not null constraints This patch tries to use index_beginscan() / index_getnext() / index_endscan() mentioned in [1] to speedup adding not-null constraints to the existing table. The main logic happens in phase3 ATRewriteTable 1. collect all not-null constraints. 2. For each not-null constraint, check whether there is a corresponding index available for validation. If not, then cannot use indexscan mechanism to verify not-null constraints. 3. If any of the following conditions are true * table scan or rewrite is required, * table wasn't locked with `AccessExclusiveLock` * the NOT NULL constraint applies to a virtual generated column then index scan cannot be used for fast validation. If all conditions are satisfied, attempt to use indexscan to verify whether the column contains any NULL values. concurrency concern: ALTER TABLE SET NOT NULL will take an ACCESS EXCLUSIVE lock, so there is less variant of racing issue can occur? to prove accurate, I wrote some isolation tests. see[2] references: [1] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com [2] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de discussion: https://postgr.es/m/CACJufxFiW=4k1is=F1J=r-Cx1RuByXQPUrWB331U47rSnGz+hw@mail.gmail.com commitfest entry: https://commitfest.postgresql.org/patch/5444 --- src/backend/commands/tablecmds.c | 285 +++++++++++++++++++++- src/test/regress/expected/constraints.out | 65 +++++ src/test/regress/sql/constraints.sql | 48 ++++ 3 files changed, 392 insertions(+), 6 deletions(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index f976c0e5c7e..49dbc690e90 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -210,16 +210,19 @@ typedef struct AlteredTableInfo List *changedStatisticsDefs; /* string definitions of same */ } AlteredTableInfo; -/* Struct describing one new constraint to check in Phase 3 scan */ -/* Note: new not-null constraints are handled elsewhere */ +/* + * Struct describing one new constraint to check in Phase 3 scan. Note: new + * not-null constraints are handled here too. +*/ typedef struct NewConstraint { char *name; /* Constraint name, or NULL if none */ - ConstrType contype; /* CHECK or FOREIGN */ + ConstrType contype; /* CHECK or FOREIGN or NOT NULL */ Oid refrelid; /* PK rel, if FOREIGN */ Oid refindid; /* OID of PK's index, if FOREIGN */ bool conwithperiod; /* Whether the new FOREIGN KEY uses PERIOD */ Oid conid; /* OID of pg_constraint entry, if FOREIGN */ + int attnum; /* NOT NULL constraint attribute number */ Node *qual; /* Check expr or CONSTR_FOREIGN Constraint */ ExprState *qualstate; /* Execution state for CHECK expr */ } NewConstraint; @@ -745,6 +748,7 @@ static void ATExecMergePartitions(List **wqueue, AlteredTableInfo *tab, Relation static void ATExecSplitPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, PartitionCmd *cmd, AlterTableUtilityContext *context); +static bool index_check_notnull(Relation relation, List *notnull_attnums); /* ---------------------------------------------------------------- * DefineRelation @@ -6217,6 +6221,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) needscan = true; con->qualstate = ExecPrepareExpr((Expr *) expand_generated_columns_in_expr(con->qual, oldrel, 1), estate); break; + case CONSTR_NOTNULL: + /* Nothing to do here */ + break; case CONSTR_FOREIGN: /* Nothing to do here */ break; @@ -6270,10 +6277,78 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) wholeatt->attnum); } } - if (notnull_attrs || notnull_virtual_attrs) - needscan = true; } + /* + * The conditions for using indexscan mechanism fast verifying not-null + * constraints are quite strict. All of the following conditions must be + * met. + * + * 1. AlteredTableInfo->verify_new_notnull is true. + * + * 2. If table scan (for other reason, like check constraint verification) + * or table rewrite is expected later, using indexscan is just wastes + * cycles. + * + * 3. Indexes cannot be created on virtual generated columns, so fast + * checking not-null constraints is not applicable to them. + * + * 4. This applies to plain relations only. + * + * 5. To prevent possible concurrency issues, the table must already be + * locked with an AccessExclusiveLock, which is the lock obtained during + * ALTER TABLE SET NOT NULL. + */ + if (!needscan && + newrel == NULL && + tab->verify_new_notnull && + notnull_virtual_attrs == NIL && + oldrel->rd_rel->relkind == RELKIND_RELATION && + CheckRelationLockedByMe(oldrel, AccessExclusiveLock, false)) + { + List *notnull_attnums = NIL; + + Assert(!tab->rewrite); + + foreach(l, tab->constraints) + { + Form_pg_attribute attr; + NewConstraint *con = lfirst(l); + + /* + * This may be a FOREIGN KEY (even if not a CHECK constraint), + * therefore we cannot just raise error for non-NOT-NULL + * constraint. + */ + if (con->contype != CONSTR_NOTNULL) + continue; + + attr = TupleDescAttr(newTupDesc, con->attnum - 1); + + if (attr->attisdropped) + continue; + + Assert(attr->attnotnull); + Assert(attr->attnum == con->attnum); + Assert(attr->attgenerated != ATTRIBUTE_GENERATED_VIRTUAL); + + notnull_attnums = list_append_unique_int(notnull_attnums, + attr->attnum); + } + + if (notnull_attnums != NIL) + { + if (index_check_notnull(oldrel, notnull_attnums)) + ereport(DEBUG1, + errmsg_internal("all new not-null constraints on relation \"%s\" have been validated by using index scan", + RelationGetRelationName(oldrel))); + else + needscan = true; + } + } + else if (notnull_attrs || notnull_virtual_attrs) + needscan = true; + if (newrel || needscan) { ExprContext *econtext; @@ -7949,6 +8024,7 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName, CookedConstraint *ccon; List *cooked; bool is_no_inherit = false; + AlteredTableInfo *tab = ATGetQueueEntry(wqueue, rel);; /* Guard against stack overflow due to overly deep inheritance tree. */ check_stack_depth(); @@ -8077,6 +8153,25 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName, cooked = AddRelationNewConstraints(rel, NIL, list_make1(constraint), false, !recursing, false, NULL); ccon = linitial(cooked); + + Assert(ccon->contype == CONSTR_NOTNULL); + + /* + * We may using indexscan mechanism to vertify this not-null constraint in + * Phase3. Add this to-be-validated not-null constraint to Phase 3's + * queue. + */ + if (!ccon->skip_validation) + { + NewConstraint *newcon = palloc0_object(NewConstraint); + + newcon->name = ccon->name; + newcon->contype = CONSTR_NOTNULL; + newcon->attnum = ccon->attnum; + + tab->constraints = lappend(tab->constraints, newcon); + } + ObjectAddressSet(address, ConstraintRelationId, ccon->conoid); InvokeObjectPostAlterHook(RelationRelationId, @@ -9983,13 +10078,15 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel, { CookedConstraint *ccon = (CookedConstraint *) lfirst(lcon); - if (!ccon->skip_validation && ccon->contype != CONSTR_NOTNULL) + if (!ccon->skip_validation) { NewConstraint *newcon; newcon = palloc0_object(NewConstraint); newcon->name = ccon->name; newcon->contype = ccon->contype; + if (ccon->contype == CONSTR_NOTNULL) + newcon->attnum = ccon->attnum; newcon->qual = ccon->expr; tab->constraints = lappend(tab->constraints, newcon); @@ -13258,6 +13355,7 @@ QueueNNConstraintValidation(List **wqueue, Relation conrel, Relation rel, List *children = NIL; AttrNumber attnum; char *colname; + NewConstraint *newcon; con = (Form_pg_constraint) GETSTRUCT(contuple); Assert(con->contype == CONSTRAINT_NOTNULL); @@ -13321,7 +13419,19 @@ QueueNNConstraintValidation(List **wqueue, Relation conrel, Relation rel, /* Set attnotnull appropriately without queueing another validation */ set_attnotnull(NULL, rel, attnum, true, false); + /* + * Queue validation for phase 3. ALTER TABLE SET NOT NULL adds a NOT NULL + * constraint to AlteredTableInfo->constraints, for consistency, do the + * same here. + */ + newcon = palloc0_object(NewConstraint); + newcon->name = colname; + newcon->contype = CONSTR_NOTNULL; + newcon->attnum = attnum; + + /* Find or create work queue entry for this table */ tab = ATGetQueueEntry(wqueue, rel); + tab->constraints = lappend(tab->constraints, newcon); tab->verify_new_notnull = true; /* @@ -23340,3 +23450,166 @@ ATExecSplitPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, /* Restore the userid and security context. */ SetUserIdAndSecContext(save_userid, save_sec_context); } + +/* + * notnull_attnums: list of attribute numbers for all newly added NOT NULL + * constraints. + * + * For each NOT NULL attribute, first check whether the relation has a suitable + * index to fetch that attribute content. If so, using indexscan to verify that + * attribute contains no NULL values. + * + * Returning true indicates that all new NOT NULL constraints have been + * validated and that no additional table scan is required. + */ +static bool +index_check_notnull(Relation relation, List *notnull_attnums) +{ + SysScanDesc indscan; + ScanKeyData skey; + HeapTuple htup; + List *idxs = NIL; + List *attnums = NIL; + bool all_not_null = true; + ListCell *lc, + *lc2; + + Relation pg_index = table_open(IndexRelationId, AccessShareLock); + + /* Prepare to scan pg_index for entries having indrelid = this rel */ + ScanKeyInit(&skey, + Anum_pg_index_indrelid, + BTEqualStrategyNumber, + F_OIDEQ, + ObjectIdGetDatum(RelationGetRelid(relation))); + + indscan = systable_beginscan(pg_index, IndexIndrelidIndexId, true, + NULL, 1, &skey); + + while (HeapTupleIsValid(htup = systable_getnext(indscan))) + { + Form_pg_attribute attr; + Relation index_rel; + + Form_pg_index index = (Form_pg_index) GETSTRUCT(htup); + + /* + * We only use non-deferred, valid, and alive btree index to vertify + * not-null constraints + */ + if (!index->indimmediate || !index->indisvalid || !index->indislive) + continue; + + /* cannot use expression index or partial index too */ + if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL) || + !heap_attisnull(htup, Anum_pg_index_indpred, NULL)) + continue; + + index_rel = index_open(index->indexrelid, AccessShareLock); + + if (index_rel->rd_rel->relam != BTREE_AM_OID) + { + index_close(index_rel, AccessShareLock); + + continue; + } + + attr = TupleDescAttr(RelationGetDescr(relation), (index->indkey.values[0] - 1)); + + if (list_member_int(notnull_attnums, attr->attnum) && + !list_member_int(attnums, attr->attnum)) + { + attnums = lappend_int(attnums, attr->attnum); + idxs = lappend_oid(idxs, index->indexrelid); + } + + index_close(index_rel, NoLock); + } + systable_endscan(indscan); + table_close(pg_index, AccessShareLock); + + /* + * Each NOT NULL constraint requires a suitable index for verification. If + * no such index exists, fall back to a regular table scan to verify the + * constraints. + */ + if (attnums == NIL || + list_length(notnull_attnums) != list_length(attnums)) + return false; + + foreach_int(attno, notnull_attnums) + { + if (!list_member_int(attnums, attno)) + return false; + } + + forboth(lc, attnums, lc2, idxs) + { + SnapshotData DirtySnapshot; + IndexScanDesc index_scan; + ScanKeyData scankeys[INDEX_MAX_KEYS]; + AttrNumber sk_attno = -1; + AttrNumber attno = lfirst_int(lc); + Oid indexoid = lfirst_oid(lc2); + + Relation idxrel = index_open(indexoid, NoLock); + IndexInfo *indexInfo = BuildIndexInfo(idxrel); + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel); + + TupleTableSlot *existing_slot = table_slot_create(relation, NULL); + + /* + * Search the tuples that are in the index for any violations, + * including tuples that aren't visible yet. + */ + InitDirtySnapshot(DirtySnapshot); + + if (indexInfo->ii_IndexAttrNumbers[0] == attno) + sk_attno = 1; + else + elog(ERROR, "cache lookup failed for attribute number %d on index %u", + indexoid, attno); + + for (int i = 0; i < indnkeyatts; i++) + { + /* set up an IS NULL scan key so that we ignore not nulls */ + ScanKeyEntryInitialize(&scankeys[i], + SK_ISNULL | SK_SEARCHNULL, + sk_attno, /* index col to scan */ + InvalidStrategy, /* no strategy */ + InvalidOid, /* no strategy subtype */ + InvalidOid, /* no collation */ + InvalidOid, /* no reg proc for this */ + (Datum) 0); /* constant */ + } + + index_scan = index_beginscan(relation, + idxrel, + &DirtySnapshot, + NULL, + indnkeyatts, + 0); + index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0); + + while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot)) + { + /* + * At this point, we have found that some attribute value is NULL. + * btree indexes are never lossy, we don't need recheck the + * condition. + */ + all_not_null = false; + break; + } + + index_endscan(index_scan); + index_close(idxrel, NoLock); + ExecDropSingleTupleTableSlot(existing_slot); + + /* exit earlier */ + if (!all_not_null) + return false; + } + + return true; +} diff --git a/src/test/regress/expected/constraints.out b/src/test/regress/expected/constraints.out index ebc892a2a42..fc6f5457370 100644 --- a/src/test/regress/expected/constraints.out +++ b/src/test/regress/expected/constraints.out @@ -1717,3 +1717,68 @@ COMMENT ON CONSTRAINT inv_ck ON DOMAIN constraint_comments_dom IS 'comment on in CREATE TABLE regress_notnull1 (a integer); CREATE TABLE regress_notnull2 () INHERITS (regress_notnull1); ALTER TABLE ONLY regress_notnull2 ALTER COLUMN a SET NOT NULL; +CREATE TABLE tp_notnull (a int, b int, c int, d int, f1 int default 10, f2 int default 11) PARTITION BY range(a); +CREATE TABLE tp_notnull_1 partition of tp_notnull for values FROM ( 1 ) to (10); +CREATE TABLE tp_notnull_2 partition of tp_notnull for values FROM ( 10 ) to (21); +CREATE INDEX tp_notnull_idx_b ON tp_notnull(b); +CREATE INDEX tp_notnull_idx_c ON tp_notnull(c); +CREATE INDEX tp_notnull_idx_d ON tp_notnull(d); +CREATE INDEX tp_notnull_idx_d_include ON tp_notnull(d) include(f1); +CREATE INDEX tp_notnull_idx_f2 ON tp_notnull USING hash (f2 int4_ops); +INSERT INTO tp_notnull SELECT g, g + 10, g + 15, g + random(min=>1::int, max=>10::int) FROM generate_series(8,19) g; +SET log_statement to NONE; +SET client_min_messages TO 'debug1'; +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL; +DEBUG: all new not-null constraints on relation "tp_notnull_1" have been validated by using index scan +DEBUG: all new not-null constraints on relation "tp_notnull_2" have been validated by using index scan +ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL; +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull_1 ALTER COLUMN b SET NOT NULL; +DEBUG: all new not-null constraints on relation "tp_notnull_1" have been validated by using index scan +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull_2 ALTER COLUMN b SET NOT NULL; +DEBUG: all new not-null constraints on relation "tp_notnull_2" have been validated by using index scan +ALTER TABLE tp_notnull_2 ADD CONSTRAINT nn_d NOT NULL d NOT VALID; +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull_2 VALIDATE CONSTRAINT nn_d; +DEBUG: verifying table "tp_notnull_2" +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ADD CONSTRAINT nnf1 NOT NULL f1; +DEBUG: verifying table "tp_notnull_1" +DEBUG: verifying table "tp_notnull_2" +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ADD CONSTRAINT nnf2 NOT NULL f2; +DEBUG: verifying table "tp_notnull_1" +DEBUG: verifying table "tp_notnull_2" +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN a SET NOT NULL, ADD CONSTRAINT c_nn NOT NULL c; +DEBUG: verifying table "tp_notnull_1" +DEBUG: verifying table "tp_notnull_2" +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN d SET NOT NULL, ADD CONSTRAINT c_nn NOT NULL c, ALTER COLUMN b SET NOT NULL; +DEBUG: all new not-null constraints on relation "tp_notnull_1" have been validated by using index scan +ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL; +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL, ALTER COLUMN b SET DATA TYPE BIGINT; +DEBUG: rewriting table "tp_notnull_1" +DEBUG: building index "tp_notnull_1_c_idx" on table "tp_notnull_1" serially +DEBUG: index "tp_notnull_1_c_idx" can safely use deduplication +DEBUG: building index "tp_notnull_1_d_idx" on table "tp_notnull_1" serially +DEBUG: index "tp_notnull_1_d_idx" can safely use deduplication +DEBUG: building index "tp_notnull_1_d_f1_idx" on table "tp_notnull_1" serially +DEBUG: building index "tp_notnull_1_f2_idx" on table "tp_notnull_1" serially +DEBUG: building index "tp_notnull_1_b_idx" on table "tp_notnull_1" serially +DEBUG: index "tp_notnull_1_b_idx" can safely use deduplication +DEBUG: rewriting table "tp_notnull_2" +DEBUG: building index "tp_notnull_2_c_idx" on table "tp_notnull_2" serially +DEBUG: index "tp_notnull_2_c_idx" can safely use deduplication +DEBUG: building index "tp_notnull_2_d_idx" on table "tp_notnull_2" serially +DEBUG: index "tp_notnull_2_d_idx" can safely use deduplication +DEBUG: building index "tp_notnull_2_d_f1_idx" on table "tp_notnull_2" serially +DEBUG: building index "tp_notnull_2_f2_idx" on table "tp_notnull_2" serially +DEBUG: building index "tp_notnull_2_b_idx" on table "tp_notnull_2" serially +DEBUG: index "tp_notnull_2_b_idx" can safely use deduplication +RESET client_min_messages; +RESET log_statement; +DROP TABLE tp_notnull; diff --git a/src/test/regress/sql/constraints.sql b/src/test/regress/sql/constraints.sql index 1e9989698b6..3e6c8a1120a 100644 --- a/src/test/regress/sql/constraints.sql +++ b/src/test/regress/sql/constraints.sql @@ -1059,3 +1059,51 @@ COMMENT ON CONSTRAINT inv_ck ON DOMAIN constraint_comments_dom IS 'comment on in CREATE TABLE regress_notnull1 (a integer); CREATE TABLE regress_notnull2 () INHERITS (regress_notnull1); ALTER TABLE ONLY regress_notnull2 ALTER COLUMN a SET NOT NULL; + +CREATE TABLE tp_notnull (a int, b int, c int, d int, f1 int default 10, f2 int default 11) PARTITION BY range(a); +CREATE TABLE tp_notnull_1 partition of tp_notnull for values FROM ( 1 ) to (10); +CREATE TABLE tp_notnull_2 partition of tp_notnull for values FROM ( 10 ) to (21); +CREATE INDEX tp_notnull_idx_b ON tp_notnull(b); +CREATE INDEX tp_notnull_idx_c ON tp_notnull(c); +CREATE INDEX tp_notnull_idx_d ON tp_notnull(d); +CREATE INDEX tp_notnull_idx_d_include ON tp_notnull(d) include(f1); +CREATE INDEX tp_notnull_idx_f2 ON tp_notnull USING hash (f2 int4_ops); +INSERT INTO tp_notnull SELECT g, g + 10, g + 15, g + random(min=>1::int, max=>10::int) FROM generate_series(8,19) g; + +SET log_statement to NONE; +SET client_min_messages TO 'debug1'; + +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL; +ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL; + +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull_1 ALTER COLUMN b SET NOT NULL; + +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull_2 ALTER COLUMN b SET NOT NULL; + +ALTER TABLE tp_notnull_2 ADD CONSTRAINT nn_d NOT NULL d NOT VALID; +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull_2 VALIDATE CONSTRAINT nn_d; + +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ADD CONSTRAINT nnf1 NOT NULL f1; + +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ADD CONSTRAINT nnf2 NOT NULL f2; + +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN a SET NOT NULL, ADD CONSTRAINT c_nn NOT NULL c; + +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN d SET NOT NULL, ADD CONSTRAINT c_nn NOT NULL c, ALTER COLUMN b SET NOT NULL; + +ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL; +--debug message should not contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL, ALTER COLUMN b SET DATA TYPE BIGINT; + +RESET client_min_messages; +RESET log_statement; + +DROP TABLE tp_notnull; -- 2.34.1 ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: using index to speedup add not null constraints to a table 2026-02-06 04:58 Re: using index to speedup add not null constraints to a table jian he <[email protected]> @ 2026-04-07 07:39 ` jian he <[email protected]> 2026-04-07 10:02 ` Re: using index to speedup add not null constraints to a table Álvaro Herrera <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: jian he @ 2026-04-07 07:39 UTC (permalink / raw) To: Álvaro Herrera <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> On Fri, Feb 6, 2026 at 12:58 PM jian he <[email protected]> wrote: > +-- should produce a debug message containing "have been validated by using index scan" + ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL, ALTER COLUMN b SET DATA TYPE BIGINT; DEBUG: rehashing catalog cache id 64 for pg_class; 257 tups, 128 buckets The above DEBUG message make me realize that +SET client_min_messages TO 'debug1'; +--debug message should contain "have been validated by using index scan" +ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL; +DEBUG: all new not-null constraints on relation "tp_notnull_1" have been validated by using index scan +DEBUG: all new not-null constraints on relation "tp_notnull_2" have been validated by using index scan Using the above regression test to check if certain ``ereport(DEBUG1,`` is reached is unstable. Therefore, we fall back to the previously used TAP tests to ensure ereport(DEBUG1, errmsg_internal("all new not-null constraints on relation \"%s\" have been validated by using index scan", RelationGetRelationName(oldrel))); is being reached. Rebased and made some minor comment adjustments; no big change. -- jian https://www.enterprisedb.com/ Attachments: [text/x-patch] v13-0001-using-indexscan-to-speedup-add-not-null-constraints.patch (19.5K, 2-v13-0001-using-indexscan-to-speedup-add-not-null-constraints.patch) download | inline diff: From 5a07e49e06eff375968ec111b38d6580174e8cca Mon Sep 17 00:00:00 2001 From: jian he <[email protected]> Date: Tue, 7 Apr 2026 15:26:15 +0800 Subject: [PATCH v13 1/1] using indexscan to speedup add not null constraints This patch tries to use index_beginscan() / index_getnext() / index_endscan() mentioned in [1] to speedup adding not-null constraints to the existing table. The main logic happens in phase3 ATRewriteTable 1. collect all not-null constraints. 2. For each not-null constraint, check whether there is a corresponding index available for validation. If not, then cannot use indexscan mechanism to verify not-null constraints. 3. If any of the following conditions are true * table scan or rewrite is required, * table wasn't locked with `AccessExclusiveLock` * the NOT NULL constraint applies to a virtual generated column then index scan cannot be used for fast validation. If all conditions are satisfied, attempt to use indexscan to verify whether the column contains any NULL values. concurrency concern: ALTER TABLE SET NOT NULL will take an ACCESS EXCLUSIVE lock, so there is less variant of racing issue can occur. see[2] references: [1] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com [2] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de discussion: https://postgr.es/m/CACJufxFiW=4k1is=F1J=r-Cx1RuByXQPUrWB331U47rSnGz+hw@mail.gmail.com commitfest entry: https://commitfest.postgresql.org/patch/5444 --- src/backend/commands/tablecmds.c | 282 +++++++++++++++++- src/test/modules/test_misc/meson.build | 1 + .../t/012_indexscan_validate_notnull.pl | 118 ++++++++ 3 files changed, 396 insertions(+), 5 deletions(-) create mode 100644 src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index eec09ba1ded..7b88d0882bd 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -211,16 +211,19 @@ typedef struct AlteredTableInfo List *changedStatisticsDefs; /* string definitions of same */ } AlteredTableInfo; -/* Struct describing one new constraint to check in Phase 3 scan */ -/* Note: new not-null constraints are handled elsewhere */ +/* + * Struct describing one new constraint to check in Phase 3 scan. Note: new + * not-null constraints are handled here too. +*/ typedef struct NewConstraint { char *name; /* Constraint name, or NULL if none */ - ConstrType contype; /* CHECK or FOREIGN */ + ConstrType contype; /* CHECK or FOREIGN or NOT NULL */ Oid refrelid; /* PK rel, if FOREIGN */ Oid refindid; /* OID of PK's index, if FOREIGN */ bool conwithperiod; /* Whether the new FOREIGN KEY uses PERIOD */ Oid conid; /* OID of pg_constraint entry, if FOREIGN */ + int attnum; /* NOT NULL constraint attribute number */ Node *qual; /* Check expr or CONSTR_FOREIGN Constraint */ ExprState *qualstate; /* Execution state for CHECK expr */ } NewConstraint; @@ -760,6 +763,7 @@ static void ATExecMergePartitions(List **wqueue, AlteredTableInfo *tab, Relation static void ATExecSplitPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, PartitionCmd *cmd, AlterTableUtilityContext *context); +static bool index_check_notnull(Relation relation, List *notnull_attnums); /* ---------------------------------------------------------------- * DefineRelation @@ -6251,6 +6255,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) needscan = true; con->qualstate = ExecPrepareExpr((Expr *) expand_generated_columns_in_expr(con->qual, oldrel, 1), estate); break; + case CONSTR_NOTNULL: + /* Nothing to do here */ + break; case CONSTR_FOREIGN: /* Nothing to do here */ break; @@ -6304,9 +6311,74 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) wholeatt->attnum); } } - if (notnull_attrs || notnull_virtual_attrs) + } + + /* + * The conditions for using indexscan mechanism fast verifying not-null + * constraints are quite strict. All of the following conditions must be + * met. + * + * 1. AlteredTableInfo->verify_new_notnull is true. + * + * 2. No table scan (e.g., for CHECK constraint verification) or table + * rewrite is expected later, if one is, using indexscan would just wastes + * cycles. + * + * 3. Indexes cannot be created on virtual generated columns, so fast + * checking not-null constraints is not applicable to them. + * + * 4. The relation must be a plain table. + * + * 5. To prevent possible concurrency issues, the table must already be + * locked with an AccessExclusiveLock, which is the lock obtained during + * ALTER TABLE SET NOT NULL. + */ + if (!needscan && + newrel == NULL && + tab->verify_new_notnull && + notnull_virtual_attrs == NIL && + oldrel->rd_rel->relkind == RELKIND_RELATION && + CheckRelationLockedByMe(oldrel, AccessExclusiveLock, false)) + { + List *notnull_attnums = NIL; + + Assert(!tab->rewrite); + + foreach(l, tab->constraints) + { + Form_pg_attribute attr; + NewConstraint *con = lfirst(l); + + /* + * Skip any constraint that is not a NOT NULL constraint. We + * cannot simply raise an error here, because this might be a + * FOREIGN KEY (even if it isn't a CHECK constraint). + */ + if (con->contype != CONSTR_NOTNULL) + continue; + + attr = TupleDescAttr(newTupDesc, con->attnum - 1); + + if (attr->attisdropped) + continue; + + Assert(attr->attnotnull); + Assert(attr->attnum == con->attnum); + Assert(attr->attgenerated != ATTRIBUTE_GENERATED_VIRTUAL); + + notnull_attnums = list_append_unique_int(notnull_attnums, + attr->attnum); + } + + if (notnull_attnums != NIL && index_check_notnull(oldrel, notnull_attnums)) + ereport(DEBUG1, + errmsg_internal("all new not-null constraints on relation \"%s\" have been validated by using index scan", + RelationGetRelationName(oldrel))); + else needscan = true; } + else if (notnull_attrs || notnull_virtual_attrs) + needscan = true; if (newrel || needscan) { @@ -8009,6 +8081,7 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName, CookedConstraint *ccon; List *cooked; bool is_no_inherit = false; + AlteredTableInfo *tab = ATGetQueueEntry(wqueue, rel);; /* Guard against stack overflow due to overly deep inheritance tree. */ check_stack_depth(); @@ -8137,6 +8210,25 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName, cooked = AddRelationNewConstraints(rel, NIL, list_make1(constraint), false, !recursing, false, NULL); ccon = linitial(cooked); + + Assert(ccon->contype == CONSTR_NOTNULL); + + /* + * We may using indexscan mechanism to vertify this not-null constraint in + * Phase3. Add this to-be-validated not-null constraint to Phase 3's + * queue. + */ + if (!ccon->skip_validation) + { + NewConstraint *newcon = palloc0_object(NewConstraint); + + newcon->name = ccon->name; + newcon->contype = CONSTR_NOTNULL; + newcon->attnum = ccon->attnum; + + tab->constraints = lappend(tab->constraints, newcon); + } + ObjectAddressSet(address, ConstraintRelationId, ccon->conoid); /* Mark pg_attribute.attnotnull for the column and queue validation */ @@ -10030,13 +10122,15 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel, { CookedConstraint *ccon = (CookedConstraint *) lfirst(lcon); - if (!ccon->skip_validation && ccon->contype != CONSTR_NOTNULL) + if (!ccon->skip_validation) { NewConstraint *newcon; newcon = palloc0_object(NewConstraint); newcon->name = ccon->name; newcon->contype = ccon->contype; + if (ccon->contype == CONSTR_NOTNULL) + newcon->attnum = ccon->attnum; newcon->qual = ccon->expr; tab->constraints = lappend(tab->constraints, newcon); @@ -13474,6 +13568,7 @@ QueueNNConstraintValidation(List **wqueue, Relation conrel, Relation rel, List *children = NIL; AttrNumber attnum; char *colname; + NewConstraint *newcon; con = (Form_pg_constraint) GETSTRUCT(contuple); Assert(con->contype == CONSTRAINT_NOTNULL); @@ -13537,7 +13632,21 @@ QueueNNConstraintValidation(List **wqueue, Relation conrel, Relation rel, /* Set attnotnull appropriately without queueing another validation */ set_attnotnull(NULL, rel, attnum, true, false); + /* + * Queue validation for phase 3. ALTER TABLE SET NOT NULL adds a NOT NULL + * constraint (NewConstraint) to AlteredTableInfo->constraints; for + * consistency, we do the same here. This setup allows phase 3 + * (ATRewriteTable) to quickly validate the column's NULL status using an + * index scan, if all conditions are met. + */ + newcon = palloc0_object(NewConstraint); + newcon->name = colname; + newcon->contype = CONSTR_NOTNULL; + newcon->attnum = attnum; + + /* Find or create work queue entry for this table */ tab = ATGetQueueEntry(wqueue, rel); + tab->constraints = lappend(tab->constraints, newcon); tab->verify_new_notnull = true; /* @@ -23602,3 +23711,166 @@ ATExecSplitPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, /* Restore the userid and security context. */ SetUserIdAndSecContext(save_userid, save_sec_context); } + +/* + * notnull_attnums: list of attribute numbers for all newly added NOT NULL + * constraints. + * + * For each NOT NULL attribute, first check whether the relation has a suitable + * index to fetch that attribute content. If so, using indexscan to verify that + * attribute contains no NULL values. + * + * Returning true means the new NOT NULL constraints are fully verified and no + * extra table scans are necessary. + */ +static bool +index_check_notnull(Relation relation, List *notnull_attnums) +{ + SysScanDesc indscan; + ScanKeyData skey; + HeapTuple htup; + List *idxs = NIL; + List *attnums = NIL; + bool all_not_null = true; + ListCell *lc, + *lc2; + + Relation pg_index = table_open(IndexRelationId, AccessShareLock); + + /* Prepare to scan pg_index for entries having indrelid = this rel */ + ScanKeyInit(&skey, + Anum_pg_index_indrelid, + BTEqualStrategyNumber, + F_OIDEQ, + ObjectIdGetDatum(RelationGetRelid(relation))); + + indscan = systable_beginscan(pg_index, IndexIndrelidIndexId, true, + NULL, 1, &skey); + + while (HeapTupleIsValid(htup = systable_getnext(indscan))) + { + Form_pg_attribute attr; + Relation index_rel; + + Form_pg_index index = (Form_pg_index) GETSTRUCT(htup); + + /* + * We only use non-deferred, valid, and live b-tree indexes to verify + * NOT NULL constraints. + */ + if (!index->indimmediate || !index->indisvalid || !index->indislive) + continue; + + /* cannot use expression index or partial index too */ + if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL) || + !heap_attisnull(htup, Anum_pg_index_indpred, NULL)) + continue; + + index_rel = index_open(index->indexrelid, AccessShareLock); + + if (index_rel->rd_rel->relam != BTREE_AM_OID) + { + index_close(index_rel, AccessShareLock); + + continue; + } + + attr = TupleDescAttr(RelationGetDescr(relation), (index->indkey.values[0] - 1)); + + if (list_member_int(notnull_attnums, attr->attnum) && + !list_member_int(attnums, attr->attnum)) + { + attnums = lappend_int(attnums, attr->attnum); + idxs = lappend_oid(idxs, index->indexrelid); + } + + index_close(index_rel, NoLock); + } + systable_endscan(indscan); + table_close(pg_index, AccessShareLock); + + /* + * Verify NOT NULL constraints using a suitable index, falling back to a + * full table scan if a suitable index is not present. + */ + if (attnums == NIL || + list_length(notnull_attnums) != list_length(attnums)) + return false; + + foreach_int(attno, notnull_attnums) + { + if (!list_member_int(attnums, attno)) + return false; + } + + forboth(lc, attnums, lc2, idxs) + { + SnapshotData DirtySnapshot; + IndexScanDesc index_scan; + ScanKeyData scankeys[INDEX_MAX_KEYS]; + AttrNumber sk_attno = -1; + AttrNumber attno = lfirst_int(lc); + Oid indexoid = lfirst_oid(lc2); + + Relation idxrel = index_open(indexoid, NoLock); + IndexInfo *indexInfo = BuildIndexInfo(idxrel); + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel); + + TupleTableSlot *existing_slot = table_slot_create(relation, NULL); + + /* + * Search the tuples that are in the index for any violations, + * including tuples that aren't visible yet. + */ + InitDirtySnapshot(DirtySnapshot); + + if (indexInfo->ii_IndexAttrNumbers[0] == attno) + sk_attno = 1; + else + elog(ERROR, "cache lookup failed for attribute number %d on index %u", + indexoid, attno); + + for (int i = 0; i < indnkeyatts; i++) + { + /* set up an IS NULL scan key so that we ignore not nulls */ + ScanKeyEntryInitialize(&scankeys[i], + SK_ISNULL | SK_SEARCHNULL, + sk_attno, /* index col to scan */ + InvalidStrategy, /* no strategy */ + InvalidOid, /* no strategy subtype */ + InvalidOid, /* no collation */ + InvalidOid, /* no reg proc for this */ + (Datum) 0); /* constant */ + } + + index_scan = index_beginscan(relation, + idxrel, + &DirtySnapshot, + NULL, + indnkeyatts, + 0, + SO_NONE); + index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0); + + while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot)) + { + /* + * At this point, we have found that some attribute value is NULL. + * btree indexes are never lossy, we don't need recheck the + * condition. + */ + all_not_null = false; + break; + } + + index_endscan(index_scan); + index_close(idxrel, NoLock); + ExecDropSingleTupleTableSlot(existing_slot); + + /* exit earlier */ + if (!all_not_null) + return false; + } + + return true; +} diff --git a/src/test/modules/test_misc/meson.build b/src/test/modules/test_misc/meson.build index 1b25d98f7f3..e02a2cb0c42 100644 --- a/src/test/modules/test_misc/meson.build +++ b/src/test/modules/test_misc/meson.build @@ -20,6 +20,7 @@ tests += { 't/009_log_temp_files.pl', 't/010_index_concurrently_upsert.pl', 't/011_lock_stats.pl', + 't/012_indexscan_validate_notnull.pl', ], # The injection points are cluster-wide, so disable installcheck 'runningcheck': false, diff --git a/src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl b/src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl new file mode 100644 index 00000000000..53e38588740 --- /dev/null +++ b/src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl @@ -0,0 +1,118 @@ +# Copyright (c) 2026, PostgreSQL Global Development Group + +# Verify that indexscan mechanism can speedup ALTER TABLE ADD NOT NULL + +use strict; +use warnings FATAL => 'all'; + +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; +use Time::HiRes qw(usleep); + +# Initialize a test cluster +my $node = PostgreSQL::Test::Cluster->new('primary'); +$node->init(); +# Turn message level up to DEBUG1 so that we get the messages we want to see +$node->append_conf('postgresql.conf', 'client_min_messages = DEBUG1'); +$node->start; + +# Run a SQL command and return psql's stderr (including debug messages) +sub run_sql_command +{ + my $sql = shift; + my $stderr; + + $node->psql( + 'postgres', + $sql, + stderr => \$stderr, + on_error_die => 1, + on_error_stop => 1); + return $stderr; +} + +# Verify that run_sql_command output confirms an index scan is used for the NOT NULL check. +sub is_indexscan_veritify_notnull +{ + my $output = shift; + return index($output, 'DEBUG: all new not-null constraints on relation') != -1; +} + +my $output; + +note "test alter table SET NOT NULL using indexscan with partitioned table"; + +run_sql_command( + 'CREATE TABLE tp_notnull (a int, b int, c int, d int, f1 int default 1, f2 int default 2, f3 int default 3) PARTITION BY range(a); + CREATE TABLE tp_notnull_1 partition of tp_notnull for values from ( 1 ) to (10); + CREATE TABLE tp_notnull_2 partition of tp_notnull for values from ( 10 ) to (21); + INSERT INTO tp_notnull(a,b,c,d) SELECT g, g + 10, g + 15, g FROM generate_series(1,19) g; + CREATE INDEX ON tp_notnull(b); + CREATE INDEX ON tp_notnull(c); + CREATE INDEX ON tp_notnull(d); + CREATE INDEX ON tp_notnull(f1) INCLUDE (f2); + CREATE INDEX ON tp_notnull USING hash (f3 int4_ops); + '); + +# Using index scan for NOT NULL verification requires every not-null column must possess a suitable index +$output = run_sql_command( + 'ALTER TABLE tp_notnull ALTER COLUMN a SET NOT NULL, ADD CONSTRAINT c_nn NOT NULL c'); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan to verify not-null constraints on table tp_notnull'); + +# All columns have index on it, OK +$output = run_sql_command( + 'ALTER TABLE tp_notnull ALTER COLUMN d SET NOT NULL, ALTER COLUMN c SET NOT NULL, ALTER COLUMN b SET NOT NULL;'); +ok(is_indexscan_veritify_notnull($output), + 'using indexscan to verify not-null constraints on column b, c, d'); +ok( $output =~ + m/all new not-null constraints on relation "tp_notnull_1" have been validated by using index scan/, + 'all newly added constraints proved by indexscan'); +ok( $output =~ + m/all new not-null constraints on relation "tp_notnull_2" have been validated by using index scan/, + 'all newly added constraints proved by indexscan'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL;'); +# Now using index scans on partitions to quickly verify NOT NULL constraints +$output = run_sql_command('ALTER TABLE tp_notnull_1 ALTER COLUMN b SET NOT NULL;'); +ok(is_indexscan_veritify_notnull($output), 'index scan verifies NOT NULL for column b'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL;'); +# Cannot use indexscan to verify not-null constraint if table rewrite happens +$output = run_sql_command( + 'ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL, ALTER COLUMN b SET DATA TYPE bigint;' +); +ok(!is_indexscan_veritify_notnull($output), 'table rewrite on tp_notnull, cannot use indexscan to verify not-null constraints on it'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN d DROP NOT NULL'); + +# Cannot use indexscan because ALTER TABLE VALIDATE CONSTRAINT only hold +# ShareUpdateExclusiveLock lock on the table +run_sql_command('ALTER TABLE tp_notnull ADD CONSTRAINT tp_notnull_d NOT NULL d NOT VALID;'); +$output = run_sql_command( + 'ALTER TABLE tp_notnull VALIDATE CONSTRAINT tp_notnull_d;' +); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan for ALTER TABLE VALIDATE CONSTRAINT command'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN d DROP NOT NULL; + ALTER TABLE tp_notnull ADD CONSTRAINT tp_notnull_d NOT NULL d NOT VALID;'); + +# However this will be OK, since ALTER COLUMN SET NOT NULL will acquire AccessExclusiveLock +$output = run_sql_command( + 'ALTER TABLE tp_notnull VALIDATE CONSTRAINT tp_notnull_d, ALTER COLUMN b SET NOT NULL;' +); +ok(is_indexscan_veritify_notnull($output), 'using indexscan to verify not-null constraints on column b, c, d'); + +# Indexscan mechanism to verify NOT NULL constraints cannot apply to index INCLUDE column +$output = run_sql_command( + 'ALTER TABLE tp_notnull ADD CONSTRAINT nnf4 NOT NULL f2;' +); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan to validate not-null constraint on tp_notnull'); + +# Indexscan mechanism to verify NOT NULL constraints cannot apply to non-Btree index +$output = run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN f3 SET NOT NULL;'); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan to validate not-null constraints on tp_notnull'); + +$node->stop('fast'); + +done_testing(); \ No newline at end of file -- 2.34.1 ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: using index to speedup add not null constraints to a table 2026-02-06 04:58 Re: using index to speedup add not null constraints to a table jian he <[email protected]> 2026-04-07 07:39 ` Re: using index to speedup add not null constraints to a table jian he <[email protected]> @ 2026-04-07 10:02 ` Álvaro Herrera <[email protected]> 2026-04-08 01:45 ` Re: using index to speedup add not null constraints to a table jian he <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: Álvaro Herrera @ 2026-04-07 10:02 UTC (permalink / raw) To: jian he <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> Hello, On 2026-Apr-07, jian he wrote: > Rebased and made some minor comment adjustments; no big change. I think you may have not looked at my review at https://postgr.es/m/[email protected] because as far as I can tell, this new version has essentially the same structure as before that review. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: using index to speedup add not null constraints to a table 2026-02-06 04:58 Re: using index to speedup add not null constraints to a table jian he <[email protected]> 2026-04-07 07:39 ` Re: using index to speedup add not null constraints to a table jian he <[email protected]> 2026-04-07 10:02 ` Re: using index to speedup add not null constraints to a table Álvaro Herrera <[email protected]> @ 2026-04-08 01:45 ` jian he <[email protected]> 0 siblings, 0 replies; 4+ messages in thread From: jian he @ 2026-04-08 01:45 UTC (permalink / raw) To: Álvaro Herrera <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]> On Tue, Apr 7, 2026 at 6:02 PM Álvaro Herrera <[email protected]> wrote: > > Hello, > > I think you may have not looked at my review at > https://postgr.es/m/[email protected] > because as far as I can tell, this new version has essentially the same > structure as before that review. > Hi. I guess the point I haven't addressed yet is: > However, if we do this, then I think computing notnull_attrs is > pointless. So we should only change the order: do this new check first, > and if we find that any new not-null column is on a generated column, > then we compute both notnull_virtual_attrs and notnull_attrs. No? > The attached patch implements this reordering, first do index_check_notnull then compute both notnull_virtual_attrs and notnull_attrs. -- jian https://www.enterprisedb.com/ Attachments: [text/x-patch] v14-0001-using-indexscan-to-speedup-add-not-null-constraints.patch (20.1K, 2-v14-0001-using-indexscan-to-speedup-add-not-null-constraints.patch) download | inline diff: From b1dab79bb0c81e668811bd124360d602b5c4674d Mon Sep 17 00:00:00 2001 From: jian he <[email protected]> Date: Wed, 8 Apr 2026 09:33:56 +0800 Subject: [PATCH v14 1/1] using indexscan to speedup add not null constraints This patch tries to use index_beginscan() / index_getnext() / index_endscan() mentioned in [1] to speedup adding not-null constraints to the existing table. The main logic happens in phase3 ATRewriteTable 1. collect all not-null constraints. 2. For each not-null constraint, check whether there is a corresponding index available for validation. If not, then cannot use indexscan mechanism to verify not-null constraints. 3. If any of the following conditions are true * table scan or rewrite is required, * table wasn't locked with `AccessExclusiveLock` * the NOT NULL constraint applies to a virtual generated column then index scan cannot be used for fast validation. If all conditions are satisfied, attempt to use indexscan to verify whether the column contains any NULL values. concurrency concern: ALTER TABLE SET NOT NULL will take an ACCESS EXCLUSIVE lock, so there is less variant of racing issue can occur. see[2] references: [1] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com [2] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de discussion: https://postgr.es/m/CACJufxFiW=4k1is=F1J=r-Cx1RuByXQPUrWB331U47rSnGz+hw@mail.gmail.com commitfest entry: https://commitfest.postgresql.org/patch/5444 --- src/backend/commands/tablecmds.c | 295 +++++++++++++++++- src/test/modules/test_misc/meson.build | 1 + .../t/012_indexscan_validate_notnull.pl | 118 +++++++ 3 files changed, 409 insertions(+), 5 deletions(-) create mode 100644 src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index eec09ba1ded..178ac5f2a9b 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -211,16 +211,19 @@ typedef struct AlteredTableInfo List *changedStatisticsDefs; /* string definitions of same */ } AlteredTableInfo; -/* Struct describing one new constraint to check in Phase 3 scan */ -/* Note: new not-null constraints are handled elsewhere */ +/* + * Struct describing one new constraint to check in Phase 3 scan. Note: new + * not-null constraints are handled here too. +*/ typedef struct NewConstraint { char *name; /* Constraint name, or NULL if none */ - ConstrType contype; /* CHECK or FOREIGN */ + ConstrType contype; /* CHECK or FOREIGN or NOT NULL */ Oid refrelid; /* PK rel, if FOREIGN */ Oid refindid; /* OID of PK's index, if FOREIGN */ bool conwithperiod; /* Whether the new FOREIGN KEY uses PERIOD */ Oid conid; /* OID of pg_constraint entry, if FOREIGN */ + int attnum; /* NOT NULL constraint attribute number */ Node *qual; /* Check expr or CONSTR_FOREIGN Constraint */ ExprState *qualstate; /* Execution state for CHECK expr */ } NewConstraint; @@ -760,6 +763,7 @@ static void ATExecMergePartitions(List **wqueue, AlteredTableInfo *tab, Relation static void ATExecSplitPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, PartitionCmd *cmd, AlterTableUtilityContext *context); +static bool index_check_notnull(Relation relation, List *notnull_attnums); /* ---------------------------------------------------------------- * DefineRelation @@ -6251,6 +6255,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) needscan = true; con->qualstate = ExecPrepareExpr((Expr *) expand_generated_columns_in_expr(con->qual, oldrel, 1), estate); break; + case CONSTR_NOTNULL: + /* Nothing to do here */ + break; case CONSTR_FOREIGN: /* Nothing to do here */ break; @@ -6278,6 +6285,78 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) notnull_attrs = notnull_virtual_attrs = NIL; if (newrel || tab->verify_new_notnull) { + bool null_vertified = false; + + /* + * The conditions for using indexscan mechanism fast verifying + * not-null constraints are quite strict. All of the following + * conditions must be met. + * + * 1. AlteredTableInfo->verify_new_notnull is true. + * + * 2. No table scan (e.g., for CHECK constraint verification) or table + * rewrite is expected later, if one is, using indexscan would just + * wastes cycles. + * + * 3. Indexes cannot be created on virtual generated columns, so fast + * checking not-null constraints is not applicable to them. + * + * 4. The relation must be a plain table. + * + * 5. To prevent possible concurrency issues, the table must already + * be locked with an AccessExclusiveLock, which is the lock obtained + * during ALTER TABLE SET NOT NULL. + */ + if (!needscan && + newrel == NULL && + oldrel->rd_rel->relkind == RELKIND_RELATION && + CheckRelationLockedByMe(oldrel, AccessExclusiveLock, false)) + { + List *notnull_attnums = NIL; + + Assert(!tab->rewrite); + + foreach(l, tab->constraints) + { + Form_pg_attribute attr; + NewConstraint *con = lfirst(l); + + if (con->contype != CONSTR_NOTNULL) + continue; + + attr = TupleDescAttr(newTupDesc, con->attnum - 1); + + if (attr->attisdropped) + continue; + + Assert(attr->attnotnull); + Assert(attr->attnum == con->attnum); + + if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL) + { + needscan = true; + break; + } + + notnull_attnums = list_append_unique_int(notnull_attnums, + attr->attnum); + } + + if (!needscan) + { + if (index_check_notnull(oldrel, notnull_attnums)) + { + null_vertified = true; + + ereport(DEBUG1, + errmsg_internal("all new not-null constraints on relation \"%s\" have been validated by using index scan", + RelationGetRelationName(oldrel))); + } + else + needscan = true; + } + } + /* * If we are rebuilding the tuples OR if we added any new but not * verified not-null constraints, check all *valid* not-null @@ -6288,7 +6367,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) * not-null constraints over virtual generated columns; instead, they * are collected in notnull_virtual_attrs for verification elsewhere. */ - for (i = 0; i < newTupDesc->natts; i++) + for (i = 0; !null_vertified && i < newTupDesc->natts; i++) { CompactAttribute *attr = TupleDescCompactAttr(newTupDesc, i); @@ -6306,6 +6385,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) } if (notnull_attrs || notnull_virtual_attrs) needscan = true; + + if (null_vertified) + Assert(needscan == false); } if (newrel || needscan) @@ -8009,6 +8091,7 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName, CookedConstraint *ccon; List *cooked; bool is_no_inherit = false; + AlteredTableInfo *tab = ATGetQueueEntry(wqueue, rel);; /* Guard against stack overflow due to overly deep inheritance tree. */ check_stack_depth(); @@ -8137,6 +8220,25 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName, cooked = AddRelationNewConstraints(rel, NIL, list_make1(constraint), false, !recursing, false, NULL); ccon = linitial(cooked); + + Assert(ccon->contype == CONSTR_NOTNULL); + + /* + * We may using indexscan mechanism to vertify this not-null constraint in + * Phase3. Add this to-be-validated not-null constraint to Phase 3's + * queue. + */ + if (!ccon->skip_validation) + { + NewConstraint *newcon = palloc0_object(NewConstraint); + + newcon->name = ccon->name; + newcon->contype = CONSTR_NOTNULL; + newcon->attnum = ccon->attnum; + + tab->constraints = lappend(tab->constraints, newcon); + } + ObjectAddressSet(address, ConstraintRelationId, ccon->conoid); /* Mark pg_attribute.attnotnull for the column and queue validation */ @@ -10030,13 +10132,15 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel, { CookedConstraint *ccon = (CookedConstraint *) lfirst(lcon); - if (!ccon->skip_validation && ccon->contype != CONSTR_NOTNULL) + if (!ccon->skip_validation) { NewConstraint *newcon; newcon = palloc0_object(NewConstraint); newcon->name = ccon->name; newcon->contype = ccon->contype; + if (ccon->contype == CONSTR_NOTNULL) + newcon->attnum = ccon->attnum; newcon->qual = ccon->expr; tab->constraints = lappend(tab->constraints, newcon); @@ -13474,6 +13578,7 @@ QueueNNConstraintValidation(List **wqueue, Relation conrel, Relation rel, List *children = NIL; AttrNumber attnum; char *colname; + NewConstraint *newcon; con = (Form_pg_constraint) GETSTRUCT(contuple); Assert(con->contype == CONSTRAINT_NOTNULL); @@ -13537,7 +13642,21 @@ QueueNNConstraintValidation(List **wqueue, Relation conrel, Relation rel, /* Set attnotnull appropriately without queueing another validation */ set_attnotnull(NULL, rel, attnum, true, false); + /* + * Queue validation for phase 3. ALTER TABLE SET NOT NULL adds a NOT NULL + * constraint (NewConstraint) to AlteredTableInfo->constraints; for + * consistency, we do the same here. This setup allows phase 3 + * (ATRewriteTable) to quickly validate the column's NULL status using an + * index scan, if all conditions are met. + */ + newcon = palloc0_object(NewConstraint); + newcon->name = colname; + newcon->contype = CONSTR_NOTNULL; + newcon->attnum = attnum; + + /* Find or create work queue entry for this table */ tab = ATGetQueueEntry(wqueue, rel); + tab->constraints = lappend(tab->constraints, newcon); tab->verify_new_notnull = true; /* @@ -23602,3 +23721,169 @@ ATExecSplitPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, /* Restore the userid and security context. */ SetUserIdAndSecContext(save_userid, save_sec_context); } + +/* + * notnull_attnums: list of attribute numbers for all newly added NOT NULL + * constraints. + * + * For each NOT NULL attribute, first check whether the relation has a suitable + * index to fetch that attribute content. If so, using indexscan to verify that + * attribute contains no NULL values. + * + * Returning true means the new NOT NULL constraints are fully verified and no + * extra table scans are necessary. + */ +static bool +index_check_notnull(Relation relation, List *notnull_attnums) +{ + SysScanDesc indscan; + ScanKeyData skey; + HeapTuple htup; + List *idxs = NIL; + List *attnums = NIL; + bool all_not_null = true; + ListCell *lc, + *lc2; + + Relation pg_index = table_open(IndexRelationId, AccessShareLock); + + if (notnull_attnums == NIL) + return false; + + /* Prepare to scan pg_index for entries having indrelid = this rel */ + ScanKeyInit(&skey, + Anum_pg_index_indrelid, + BTEqualStrategyNumber, + F_OIDEQ, + ObjectIdGetDatum(RelationGetRelid(relation))); + + indscan = systable_beginscan(pg_index, IndexIndrelidIndexId, true, + NULL, 1, &skey); + + while (HeapTupleIsValid(htup = systable_getnext(indscan))) + { + Form_pg_attribute attr; + Relation index_rel; + + Form_pg_index index = (Form_pg_index) GETSTRUCT(htup); + + /* + * We only use non-deferred, valid, and live b-tree indexes to verify + * NOT NULL constraints. + */ + if (!index->indimmediate || !index->indisvalid || !index->indislive) + continue; + + /* cannot use expression index or partial index too */ + if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL) || + !heap_attisnull(htup, Anum_pg_index_indpred, NULL)) + continue; + + index_rel = index_open(index->indexrelid, AccessShareLock); + + if (index_rel->rd_rel->relam != BTREE_AM_OID) + { + index_close(index_rel, AccessShareLock); + + continue; + } + + attr = TupleDescAttr(RelationGetDescr(relation), (index->indkey.values[0] - 1)); + + if (list_member_int(notnull_attnums, attr->attnum) && + !list_member_int(attnums, attr->attnum)) + { + attnums = lappend_int(attnums, attr->attnum); + idxs = lappend_oid(idxs, index->indexrelid); + } + + index_close(index_rel, NoLock); + } + systable_endscan(indscan); + table_close(pg_index, AccessShareLock); + + /* + * Verify NOT NULL constraints using a suitable index, falling back to a + * full table scan if a suitable index is not present. + */ + if (attnums == NIL || + list_length(notnull_attnums) != list_length(attnums)) + return false; + + foreach_int(attno, notnull_attnums) + { + if (!list_member_int(attnums, attno)) + return false; + } + + forboth(lc, attnums, lc2, idxs) + { + SnapshotData DirtySnapshot; + IndexScanDesc index_scan; + ScanKeyData scankeys[INDEX_MAX_KEYS]; + AttrNumber sk_attno = -1; + AttrNumber attno = lfirst_int(lc); + Oid indexoid = lfirst_oid(lc2); + + Relation idxrel = index_open(indexoid, NoLock); + IndexInfo *indexInfo = BuildIndexInfo(idxrel); + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel); + + TupleTableSlot *existing_slot = table_slot_create(relation, NULL); + + /* + * Search the tuples that are in the index for any violations, + * including tuples that aren't visible yet. + */ + InitDirtySnapshot(DirtySnapshot); + + if (indexInfo->ii_IndexAttrNumbers[0] == attno) + sk_attno = 1; + else + elog(ERROR, "cache lookup failed for attribute number %d on index %u", + indexoid, attno); + + for (int i = 0; i < indnkeyatts; i++) + { + /* set up an IS NULL scan key so that we ignore not nulls */ + ScanKeyEntryInitialize(&scankeys[i], + SK_ISNULL | SK_SEARCHNULL, + sk_attno, /* index col to scan */ + InvalidStrategy, /* no strategy */ + InvalidOid, /* no strategy subtype */ + InvalidOid, /* no collation */ + InvalidOid, /* no reg proc for this */ + (Datum) 0); /* constant */ + } + + index_scan = index_beginscan(relation, + idxrel, + &DirtySnapshot, + NULL, + indnkeyatts, + 0, + SO_NONE); + index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0); + + while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot)) + { + /* + * At this point, we have found that some attribute value is NULL. + * btree indexes are never lossy, we don't need recheck the + * condition. + */ + all_not_null = false; + break; + } + + index_endscan(index_scan); + index_close(idxrel, NoLock); + ExecDropSingleTupleTableSlot(existing_slot); + + /* exit earlier */ + if (!all_not_null) + return false; + } + + return true; +} diff --git a/src/test/modules/test_misc/meson.build b/src/test/modules/test_misc/meson.build index 1b25d98f7f3..e02a2cb0c42 100644 --- a/src/test/modules/test_misc/meson.build +++ b/src/test/modules/test_misc/meson.build @@ -20,6 +20,7 @@ tests += { 't/009_log_temp_files.pl', 't/010_index_concurrently_upsert.pl', 't/011_lock_stats.pl', + 't/012_indexscan_validate_notnull.pl', ], # The injection points are cluster-wide, so disable installcheck 'runningcheck': false, diff --git a/src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl b/src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl new file mode 100644 index 00000000000..53e38588740 --- /dev/null +++ b/src/test/modules/test_misc/t/012_indexscan_validate_notnull.pl @@ -0,0 +1,118 @@ +# Copyright (c) 2026, PostgreSQL Global Development Group + +# Verify that indexscan mechanism can speedup ALTER TABLE ADD NOT NULL + +use strict; +use warnings FATAL => 'all'; + +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; +use Time::HiRes qw(usleep); + +# Initialize a test cluster +my $node = PostgreSQL::Test::Cluster->new('primary'); +$node->init(); +# Turn message level up to DEBUG1 so that we get the messages we want to see +$node->append_conf('postgresql.conf', 'client_min_messages = DEBUG1'); +$node->start; + +# Run a SQL command and return psql's stderr (including debug messages) +sub run_sql_command +{ + my $sql = shift; + my $stderr; + + $node->psql( + 'postgres', + $sql, + stderr => \$stderr, + on_error_die => 1, + on_error_stop => 1); + return $stderr; +} + +# Verify that run_sql_command output confirms an index scan is used for the NOT NULL check. +sub is_indexscan_veritify_notnull +{ + my $output = shift; + return index($output, 'DEBUG: all new not-null constraints on relation') != -1; +} + +my $output; + +note "test alter table SET NOT NULL using indexscan with partitioned table"; + +run_sql_command( + 'CREATE TABLE tp_notnull (a int, b int, c int, d int, f1 int default 1, f2 int default 2, f3 int default 3) PARTITION BY range(a); + CREATE TABLE tp_notnull_1 partition of tp_notnull for values from ( 1 ) to (10); + CREATE TABLE tp_notnull_2 partition of tp_notnull for values from ( 10 ) to (21); + INSERT INTO tp_notnull(a,b,c,d) SELECT g, g + 10, g + 15, g FROM generate_series(1,19) g; + CREATE INDEX ON tp_notnull(b); + CREATE INDEX ON tp_notnull(c); + CREATE INDEX ON tp_notnull(d); + CREATE INDEX ON tp_notnull(f1) INCLUDE (f2); + CREATE INDEX ON tp_notnull USING hash (f3 int4_ops); + '); + +# Using index scan for NOT NULL verification requires every not-null column must possess a suitable index +$output = run_sql_command( + 'ALTER TABLE tp_notnull ALTER COLUMN a SET NOT NULL, ADD CONSTRAINT c_nn NOT NULL c'); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan to verify not-null constraints on table tp_notnull'); + +# All columns have index on it, OK +$output = run_sql_command( + 'ALTER TABLE tp_notnull ALTER COLUMN d SET NOT NULL, ALTER COLUMN c SET NOT NULL, ALTER COLUMN b SET NOT NULL;'); +ok(is_indexscan_veritify_notnull($output), + 'using indexscan to verify not-null constraints on column b, c, d'); +ok( $output =~ + m/all new not-null constraints on relation "tp_notnull_1" have been validated by using index scan/, + 'all newly added constraints proved by indexscan'); +ok( $output =~ + m/all new not-null constraints on relation "tp_notnull_2" have been validated by using index scan/, + 'all newly added constraints proved by indexscan'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL;'); +# Now using index scans on partitions to quickly verify NOT NULL constraints +$output = run_sql_command('ALTER TABLE tp_notnull_1 ALTER COLUMN b SET NOT NULL;'); +ok(is_indexscan_veritify_notnull($output), 'index scan verifies NOT NULL for column b'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN b DROP NOT NULL;'); +# Cannot use indexscan to verify not-null constraint if table rewrite happens +$output = run_sql_command( + 'ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL, ALTER COLUMN b SET DATA TYPE bigint;' +); +ok(!is_indexscan_veritify_notnull($output), 'table rewrite on tp_notnull, cannot use indexscan to verify not-null constraints on it'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN d DROP NOT NULL'); + +# Cannot use indexscan because ALTER TABLE VALIDATE CONSTRAINT only hold +# ShareUpdateExclusiveLock lock on the table +run_sql_command('ALTER TABLE tp_notnull ADD CONSTRAINT tp_notnull_d NOT NULL d NOT VALID;'); +$output = run_sql_command( + 'ALTER TABLE tp_notnull VALIDATE CONSTRAINT tp_notnull_d;' +); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan for ALTER TABLE VALIDATE CONSTRAINT command'); + +run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN d DROP NOT NULL; + ALTER TABLE tp_notnull ADD CONSTRAINT tp_notnull_d NOT NULL d NOT VALID;'); + +# However this will be OK, since ALTER COLUMN SET NOT NULL will acquire AccessExclusiveLock +$output = run_sql_command( + 'ALTER TABLE tp_notnull VALIDATE CONSTRAINT tp_notnull_d, ALTER COLUMN b SET NOT NULL;' +); +ok(is_indexscan_veritify_notnull($output), 'using indexscan to verify not-null constraints on column b, c, d'); + +# Indexscan mechanism to verify NOT NULL constraints cannot apply to index INCLUDE column +$output = run_sql_command( + 'ALTER TABLE tp_notnull ADD CONSTRAINT nnf4 NOT NULL f2;' +); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan to validate not-null constraint on tp_notnull'); + +# Indexscan mechanism to verify NOT NULL constraints cannot apply to non-Btree index +$output = run_sql_command('ALTER TABLE tp_notnull ALTER COLUMN f3 SET NOT NULL;'); +ok(!is_indexscan_veritify_notnull($output), 'cannot use indexscan to validate not-null constraints on tp_notnull'); + +$node->stop('fast'); + +done_testing(); \ No newline at end of file -- 2.34.1 ^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2026-04-08 01:45 UTC | newest] Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-02-06 04:58 Re: using index to speedup add not null constraints to a table jian he <[email protected]> 2026-04-07 07:39 ` jian he <[email protected]> 2026-04-07 10:02 ` Álvaro Herrera <[email protected]> 2026-04-08 01:45 ` jian he <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox