public inbox for [email protected]  
help / color / mirror / Atom feed
using index to speedup add not null constraints to a table
3+ messages / 1 participants
[nested] [flat]

* using index to speedup add not null constraints to a table
@ 2024-12-16 07:07  jian he <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: jian he @ 2024-12-16 07:07 UTC (permalink / raw)
  To: PostgreSQL Hackers <[email protected]>

hi.
context: i was trying to speedup add check constraint then
Sergei Kornilov sent me a link: [1].
so i studied that thread, then tried to implement $SUBJECT.
obviously not using SPI at all.


the logic is mentioned [2]:
"""
we could try an index scan - via index_beginscan() /
index_getnext() / index_endscan() - trying to pull exactly one null
from the index, and judge whether or not there are nulls present based
only whether we get one. This would be a lot cheaper than scanning a
large table, but it needs some careful thought because of visibility
issues.
"""

Currently, if the leading key column of an index is the same as the column with
the NOT NULL constraint, then $SUBJECT applies.
so the following test case, the $SUBJECT applies:
create index t_idx_ab on t(a,b);
alter table t add constraint t1 not null a;

However, query:
alter table t add constraint t1 not null b;
$SUBEJCT does not apply, since "b" is not the leading column of the index.
(It is possible that this could be implemented. So I missed something....)

This approach will not work for partitioned tables, as ALTER TABLE ALTER COLUMN
SET EXPRESSION may trigger an index rebuild.  We cannot perform an index scan
if the index will be rebuilt later.  In the future, if we determine that the
column being rebuilt in the index is the same as the column to which the NOT
NULL constraint is being added, then $SUBJECT can also be applied to partitioned
tables.



based on [3], I wrote some isolation tests to address concurrency issues.
however since add not-null constraint takes ACCESS EXCLUSIVE lock,
so there is less anomaly can happen?



PERFORMANCE:
--100% bloat and zero null bloat value:
drop table if exists t \; create unlogged table t(a int, b int, c int)
\; create index t_idx_a on t(a);
insert into t select g, g+1 from generate_series(1,1_000_000) g;
delete from t;

alter table t add constraint t1 not null a;
with patch Time:  1.873 ms
master Time: 648.312 ms


comments are welcome.

[1] https://www.postgresql.org/message-id/9878.1511970441%40sss.pgh.pa.us
[2] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com
[3] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de


Attachments:

  [text/x-patch] v1-0001-using-index-to-speedup-add-not-null-constraint-to.patch (26.1K, 2-v1-0001-using-index-to-speedup-add-not-null-constraint-to.patch)
  download | inline diff:
From 476cee10531330fa8d83aa940c227deaaad8ab47 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Mon, 16 Dec 2024 15:03:18 +0800
Subject: [PATCH v1 1/1] using index to speedup add not null constraint to a
 table

This patch trying to use index_beginscan() / index_getnext() / index_endscan() mentioned in [1]
to speedup add not-null constraints to the existing table.

there are two ways to add not-null constraints to an existing table.
1. ATExecAddConstraint->ATAddCheckNNConstraint->AddRelationNewConstraints
2. ATExecSetNotNull->AddRelationNewConstraints

the logic is:
1. In AddRelationNewConstraints, if condition meet (see get_index_for_validate_notnull)
   then attempt to use indexscan to precheck whether the column is NOT NULL.
2. AddRelationNewConstraints returned a list of CookedConstraint nodes. pass not-null cooked constraint
   to tab->constraints (a list of NewConstraint). code snippet:
	if (!ccon->skip_validation)
    {
		newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
		if (ccon->pre_validated)
			newcon->nn_attnum = ccon->attnum;
		tab->constraints = lappend(tab->constraints, newcon);
    }
3. In ATRewriteTable, verify whether all attnums associated with notnull_attrs have already been prevalidated.
   If they have, set needscan to false.

CAUTION:
ALTER TABLE SET DATA TYPE and ALTER TABLE SET EXPRESSION trigger an index rebuild
(refer to RememberAllDependentForRebuilding).
While the former cannot be applied to partitioned tables, the latter can.
If an index rebuild occurs, then indexscan cannot be used to prevalidate whether a column contains null values.
Consequently, pre-validating a NOT NULL constraint is currently not supported for partitioned tables.

AlteredTableInfo->verify_new_notnull only used in ATExecAddColumn, so in

concurrency concern:
ALTER TABLE SET NOT NULL will take an ACCESS EXCLUSIVE lock,
so there is less variant of race issue can occur?
to prove accurate, I wrote some isolate tests.
see[2]

performance concern:
it will slower than normal add not-null if your index is bloat and a large
percent of bloat is related to null value data.

take a simple example:

100% bloat and zero null bloat value:
drop table if exists t \; create unlogged table t(a int, b int, c int) \; create index t_idx_a on t(a);
insert into t select g, g+1 from generate_series(1,1_000_000) g;
delete from t;
alter table t add constraint t1 not null a;

with patch Time:  1.873 ms
master Time: 648.312 ms
--------------------------
-- %20 percent column value is null and deleted from heap still on the index.
drop table if exists t \; create unlogged table t(a int, b int) \; create index t_idx_a on t(a);
insert into t select case when g % 5 = 0 then null else g end, g+1 from generate_series(1,1_000_000) g;
delete from t where a is null;

alter table t add constraint t1 not null a;
with patch Time:: 1080.407 ms (00:01.080)
master Time: 970.671 ms

references:
[1] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com
[2] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de
---
 src/backend/catalog/heap.c                | 80 +++++++++++++++++++++++
 src/backend/catalog/pg_constraint.c       |  1 +
 src/backend/commands/tablecmds.c          | 74 ++++++++++++++++++---
 src/backend/executor/execIndexing.c       | 80 +++++++++++++++++++++++
 src/include/catalog/heap.h                |  2 +
 src/include/executor/executor.h           |  1 +
 src/test/isolation/isolation_schedule     |  1 +
 src/test/regress/expected/alter_table.out | 35 ++++++++++
 src/test/regress/sql/alter_table.sql      | 40 ++++++++++++
 9 files changed, 306 insertions(+), 8 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index d7b88b61dc..fa0872e2dd 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -57,6 +57,7 @@
 #include "commands/tablecmds.h"
 #include "commands/typecmds.h"
 #include "common/int.h"
+#include "executor/executor.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
@@ -116,6 +117,7 @@ static Node *cookConstraint(ParseState *pstate,
 							char *relname);
 
 
+static Oid get_index_for_validate_notnull(Relation rel, int colnum);
 /* ----------------------------------------------------------------
  *				XXX UGLY HARD CODED BADNESS FOLLOWS XXX
  *
@@ -2298,6 +2300,8 @@ StoreConstraints(Relation rel, List *cooked_constraints, bool is_internal)
  * newConstraints: list of Constraint nodes
  * allow_merge: true if check constraints may be merged with existing ones
  * is_local: true if definition is local, false if it's inherited
+ * index_rebuild: true if the ALTER TABLE command and it's subcommands
+ * willl do any index rebuild in future.
  * is_internal: true if result of some internal process, not a user request
  * queryString: used during expression transformation of default values and
  *		cooked CHECK constraints
@@ -2322,6 +2326,7 @@ AddRelationNewConstraints(Relation rel,
 						  bool allow_merge,
 						  bool is_local,
 						  bool is_internal,
+						  bool index_rebuild,
 						  const char *queryString)
 {
 	List	   *cookedConstraints = NIL;
@@ -2409,6 +2414,7 @@ AddRelationNewConstraints(Relation rel,
 		cooked->is_local = is_local;
 		cooked->inhcount = is_local ? 0 : 1;
 		cooked->is_no_inherit = false;
+		cooked->pre_validated = false;
 		cookedConstraints = lappend(cookedConstraints, cooked);
 	}
 
@@ -2539,6 +2545,7 @@ AddRelationNewConstraints(Relation rel,
 			cooked->is_local = is_local;
 			cooked->inhcount = is_local ? 0 : 1;
 			cooked->is_no_inherit = cdef->is_no_inherit;
+			cooked->pre_validated = false;
 			cookedConstraints = lappend(cookedConstraints, cooked);
 		}
 		else if (cdef->contype == CONSTR_NOTNULL)
@@ -2547,6 +2554,7 @@ AddRelationNewConstraints(Relation rel,
 			AttrNumber	colnum;
 			int16		inhcount = is_local ? 0 : 1;
 			char	   *nnname;
+			Oid			indexoid = InvalidOid;
 
 			/* Determine which column to modify */
 			colnum = get_attnum(RelationGetRelid(rel), strVal(linitial(cdef->keys)));
@@ -2599,6 +2607,16 @@ AddRelationNewConstraints(Relation rel,
 								inhcount,
 								cdef->is_no_inherit);
 
+			/*
+			 * if there is any index to be rebuild, then we cannot do indexscan
+			 * in index_check_notnull; for re_added not-nullconstraint also
+			 * cannot use indexscan to check null existence.
+			 *
+			*/
+			if (rel->rd_rel->relkind == RELKIND_RELATION &&
+				!is_internal &&
+				!index_rebuild)
+				indexoid = get_index_for_validate_notnull(rel, colnum);
 			nncooked = (CookedConstraint *) palloc(sizeof(CookedConstraint));
 			nncooked->contype = CONSTR_NOTNULL;
 			nncooked->conoid = constrOid;
@@ -2609,7 +2627,14 @@ AddRelationNewConstraints(Relation rel,
 			nncooked->is_local = is_local;
 			nncooked->inhcount = inhcount;
 			nncooked->is_no_inherit = cdef->is_no_inherit;
+			nncooked->pre_validated = false;
+			if (OidIsValid(indexoid))
+				nncooked->pre_validated = index_check_notnull(RelationGetRelid(rel), indexoid);
 
+			if (nncooked->pre_validated)
+				elog(DEBUG1, "can do fast not-null check");
+			else
+				elog(DEBUG1, "cannot do fast not-null check");
 			cookedConstraints = lappend(cookedConstraints, nncooked);
 		}
 	}
@@ -2626,6 +2651,61 @@ AddRelationNewConstraints(Relation rel,
 	return cookedConstraints;
 }
 
+static Oid
+get_index_for_validate_notnull(Relation relation, int colnum)
+{
+	Relation	indrel;
+	SysScanDesc indscan;
+	ScanKeyData skey;
+	HeapTuple	htup;
+	List	   *result;
+	TupleDesc	tupdesc;
+	Form_pg_attribute attr;
+
+	tupdesc = RelationGetDescr(relation);
+	result = NIL;
+
+	/* Prepare to scan pg_index for entries having indrelid = this rel. */
+	ScanKeyInit(&skey,
+				Anum_pg_index_indrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationGetRelid(relation)));
+
+	indrel = table_open(IndexRelationId, AccessShareLock);
+	indscan = systable_beginscan(indrel, IndexIndrelidIndexId, true,
+								 NULL, 1, &skey);
+
+	while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+	{
+		Form_pg_index index = (Form_pg_index) GETSTRUCT(htup);
+
+		if (!index->indimmediate || !index->indisvalid || !index->indislive)
+			continue;
+
+		if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL) ||
+			!heap_attisnull(htup, Anum_pg_index_indpred, NULL))
+			continue;
+
+		/* we are only interested in index's leading column */
+		attr = TupleDescAttr(tupdesc, (index->indkey.values[0]) - 1);
+		if (attr->attnum != colnum)
+			continue;
+
+		/* add index's OID to result list */
+		result = lappend_oid(result, index->indexrelid);
+	}
+	systable_endscan(indscan);
+	table_close(indrel, AccessShareLock);
+
+	/* Sort the result list into OID order, make the result stable. */
+	list_sort(result, list_oid_cmp);
+
+	if(list_length(result) == 0)
+		return InvalidOid;
+	else
+		return list_nth_oid(result, 0);
+}
+
 /*
  * Check for a pre-existing check constraint that conflicts with a proposed
  * new one, and either adjust its conislocal/coninhcount settings or throw
diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
index 9c05a98d28..6638f1fbe2 100644
--- a/src/backend/catalog/pg_constraint.c
+++ b/src/backend/catalog/pg_constraint.c
@@ -826,6 +826,7 @@ RelationGetNotNullConstraints(Oid relid, bool cooked, bool include_noinh)
 			cooked->is_local = true;
 			cooked->inhcount = 0;
 			cooked->is_no_inherit = conForm->connoinherit;
+			cooked->pre_validated = false;
 
 			notnulls = lappend(notnulls, cooked);
 		}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6ccae4cb4a..274568e0f2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -219,6 +219,8 @@ typedef struct NewConstraint
 	Oid			refindid;		/* OID of PK's index, if FOREIGN */
 	bool		conwithperiod;	/* Whether the new FOREIGN KEY uses PERIOD */
 	Oid			conid;			/* OID of pg_constraint entry, if FOREIGN */
+	int			nn_attnum;		/* already validated NOT-NULL constraint attnum.
+									-1 means not appliable */
 	Node	   *qual;			/* Check expr or CONSTR_FOREIGN Constraint */
 	ExprState  *qualstate;		/* Execution state for CHECK expr */
 } NewConstraint;
@@ -977,6 +979,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 			cooked->is_local = true;	/* not used for defaults */
 			cooked->inhcount = 0;	/* ditto */
 			cooked->is_no_inherit = false;
+			cooked->pre_validated = false;
 			cookedDefaults = lappend(cookedDefaults, cooked);
 			attr->atthasdef = true;
 		}
@@ -1060,7 +1063,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	 */
 	if (rawDefaults)
 		AddRelationNewConstraints(rel, rawDefaults, NIL,
-								  true, true, false, queryString);
+								  true, true, false, false, queryString);
 
 	/*
 	 * Make column generation expressions visible for use by partitioning.
@@ -1291,7 +1294,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	 */
 	if (stmt->constraints)
 		AddRelationNewConstraints(rel, NIL, stmt->constraints,
-								  true, true, false, queryString);
+								  true, true, false, false, queryString);
 
 	/*
 	 * Finally, merge the not-null constraints that are declared directly with
@@ -6038,6 +6041,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 	TupleDesc	oldTupDesc;
 	TupleDesc	newTupDesc;
 	bool		needscan = false;
+	bool		prevalided = true;
 	List	   *notnull_attrs;
 	int			i;
 	ListCell   *l;
@@ -6095,6 +6099,11 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 				needscan = true;
 				con->qualstate = ExecPrepareExpr((Expr *) con->qual, estate);
 				break;
+
+			case CONSTR_NOTNULL:
+				if(con->nn_attnum == -1)
+					prevalided = false;
+				break;
 			case CONSTR_FOREIGN:
 				/* Nothing to do here */
 				break;
@@ -6136,6 +6145,12 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 		}
 		if (notnull_attrs)
 			needscan = true;
+		/*
+		 * we did some prevalidation at phase 2, so we can safely be sure no
+		 * need to scan. see AddRelationNewConstraints and index_check_notnull
+		*/
+		if(prevalided)
+			needscan = false;
 	}
 
 	if (newrel || needscan)
@@ -7280,7 +7295,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		 * _list_ of defaults, but we just do one.
 		 */
 		AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-								  false, true, false, NULL);
+								  false, true, false, false, NULL);
 
 		/* Make the additional catalog changes visible */
 		CommandCounterIncrement();
@@ -7722,6 +7737,8 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName,
 	CookedConstraint *ccon;
 	List	   *cooked;
 	bool		is_no_inherit = false;
+	bool		index_rebuild = false;
+	AlteredTableInfo *tab = NULL;
 
 	/* Guard against stack overflow due to overly deep inheritance tree. */
 	check_stack_depth();
@@ -7837,10 +7854,39 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName,
 	constraint->is_no_inherit = is_no_inherit;
 	constraint->conname = conName;
 
+	/*
+	 * for ALTER TABLE ALTER COLUMN SET EXPRESSION, RememberIndexForRebuilding
+	 * can only remember the partitioned index entry, not the regular index. In
+	 * that case, tab->changedIndexOids won't have regular index oid, then  we
+	 * can not use index_check_notnull to fast check the column not-null status.
+	 * that why if recursing is true, we blindly assume index_rebuild is true.
+	 * we use index_rebuild for index_check_notnull purpose only.
+	 */
+	tab = ATGetQueueEntry(wqueue, rel);
+	if (list_length(tab->changedIndexOids) > 0 || recursing)
+		index_rebuild = true;
+
 	/* and do it */
 	cooked = AddRelationNewConstraints(rel, NIL, list_make1(constraint),
-									   false, !recursing, false, NULL);
+									   false, !recursing, false, index_rebuild, NULL);
 	ccon = linitial(cooked);
+
+	if (!ccon->skip_validation)
+	{
+		NewConstraint *newcon;
+
+		newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
+		newcon->name = ccon->name;
+		newcon->contype = ccon->contype;
+		if (ccon->pre_validated)
+			newcon->nn_attnum = ccon->attnum;
+		else
+			newcon->nn_attnum = -1;
+		newcon->qual = NULL;
+
+		tab->constraints = lappend(tab->constraints, newcon);
+		Assert(ccon->attnum > 0);
+	}
 	ObjectAddressSet(address, ConstraintRelationId, ccon->conoid);
 
 	InvokeObjectPostAlterHook(RelationRelationId,
@@ -7988,7 +8034,7 @@ ATExecColumnDefault(Relation rel, const char *colName,
 		 * _list_ of defaults, but we just do one.
 		 */
 		AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-								  false, true, false, NULL);
+								  false, true, false, false, NULL);
 	}
 
 	ObjectAddressSubSet(address, RelationRelationId,
@@ -8450,7 +8496,7 @@ ATExecSetExpression(AlteredTableInfo *tab, Relation rel, const char *colName,
 
 	/* Store the generated expression */
 	AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-							  false, true, false, NULL);
+							  false, true, false, false, NULL);
 
 	/* Make above new expression visible */
 	CommandCounterIncrement();
@@ -9554,6 +9600,10 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	List	   *children;
 	ListCell   *child;
 	ObjectAddress address = InvalidObjectAddress;
+	bool		index_rebuild = false;
+
+	if (list_length(tab->changedIndexOids) > 0 || recursing)
+		index_rebuild = true;
 
 	/* Guard against stack overflow due to overly deep inheritance tree. */
 	check_stack_depth();
@@ -9578,6 +9628,7 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 										recursing || is_readd,	/* allow_merge */
 										!recursing, /* is_local */
 										is_readd,	/* is_internal */
+										index_rebuild,
 										NULL);	/* queryString not available
 												 * here */
 
@@ -9589,15 +9640,19 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	{
 		CookedConstraint *ccon = (CookedConstraint *) lfirst(lcon);
 
-		if (!ccon->skip_validation && ccon->contype != CONSTR_NOTNULL)
+		if (!ccon->skip_validation)
 		{
 			NewConstraint *newcon;
 
 			newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
 			newcon->name = ccon->name;
 			newcon->contype = ccon->contype;
+			if (ccon->pre_validated && ccon->contype == CONSTR_NOTNULL)
+				newcon->nn_attnum = ccon->attnum;
+			else
+				newcon->nn_attnum = -1;
+
 			newcon->qual = ccon->expr;
-
 			tab->constraints = lappend(tab->constraints, newcon);
 		}
 
@@ -10718,6 +10773,7 @@ addFkRecurseReferencing(List **wqueue, Constraint *fkconstraint, Relation rel,
 			newcon->refindid = indexOid;
 			newcon->conid = parentConstr;
 			newcon->conwithperiod = fkconstraint->fk_with_period;
+			newcon->nn_attnum = -1;
 			newcon->qual = (Node *) fkconstraint;
 
 			tab->constraints = lappend(tab->constraints, newcon);
@@ -12026,6 +12082,7 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
 			newcon->refrelid = con->confrelid;
 			newcon->refindid = con->conindid;
 			newcon->conid = con->oid;
+			newcon->nn_attnum = -1;
 			newcon->qual = (Node *) fkconstraint;
 
 			/* Find or create work queue entry for this table */
@@ -12098,6 +12155,7 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
 
 			val = SysCacheGetAttrNotNull(CONSTROID, tuple,
 										 Anum_pg_constraint_conbin);
+			newcon->nn_attnum = -1;
 			conbin = TextDatumGetCString(val);
 			newcon->qual = (Node *) stringToNode(conbin);
 
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index f0a5f8879a..69cea974ca 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -108,6 +108,7 @@
 
 #include "access/genam.h"
 #include "access/relscan.h"
+#include "access/table.h"
 #include "access/tableam.h"
 #include "access/xact.h"
 #include "catalog/index.h"
@@ -1166,3 +1167,82 @@ ExecWithoutOverlapsNotEmpty(Relation rel, NameData attname, Datum attval, char t
 				 errmsg("empty WITHOUT OVERLAPS value found in column \"%s\" in relation \"%s\"",
 						NameStr(attname), RelationGetRelationName(rel))));
 }
+
+bool
+index_check_notnull(Oid table_oid, Oid indexoid)
+{
+	Relation index;
+	Relation heap;
+
+	SnapshotData DirtySnapshot;
+	IndexScanDesc index_scan;
+	ScanKeyData scankeys[INDEX_MAX_KEYS];
+	TupleTableSlot *existing_slot;
+	IndexInfo  *indexInfo;
+	EState	   *estate;
+	int			indnkeyatts;
+	ExprContext *econtext;
+	bool all_not_null = true;
+
+	heap = table_open(table_oid, NoLock);
+	index = index_open(indexoid, AccessShareLock);
+	/*
+	 * Need an EState for slot to hold the current tuple.
+	 *
+	 */
+	estate = CreateExecutorState();
+	econtext = GetPerTupleExprContext(estate);
+	existing_slot = table_slot_create(heap, NULL);
+
+	/* Arrange for econtext's scan tuple to be the tuple under test */
+	econtext->ecxt_scantuple = existing_slot;
+
+	indexInfo = BuildIndexInfo(index);
+	indnkeyatts = IndexRelationGetNumberOfKeyAttributes(index);
+	/*
+	 * Search the tuples that are in the index for any violations, including
+	 * tuples that aren't visible yet.
+	 */
+	InitDirtySnapshot(DirtySnapshot);
+
+	for (int i = 0; i < indnkeyatts; i++)
+	{
+		/* set up an IS NOT NULL scan key so that we ignore nulls */
+		ScanKeyEntryInitialize(&scankeys[i],
+								SK_ISNULL | SK_SEARCHNULL,
+								1,				/* index col to scan */
+								InvalidStrategy,/* no strategy */
+								InvalidOid,		/* no strategy subtype */
+								InvalidOid,		/* no collation */
+								InvalidOid,		/* no reg proc for this */
+								(Datum) 0);		/* constant */
+	}
+
+	index_scan = index_beginscan(heap, index, &DirtySnapshot, indnkeyatts, 0);
+	index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0);
+	while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot))
+	{
+		Datum		existing_values[INDEX_MAX_KEYS];
+		bool		existing_isnull[INDEX_MAX_KEYS];
+
+		/*
+		 * Extract the index column values and isnull flags from the existing
+		 * tuple.
+		 */
+		FormIndexDatum(indexInfo, existing_slot, estate,
+					   existing_values, existing_isnull);
+		if (existing_isnull[0])
+		{
+			all_not_null = false;
+			break;
+		}
+	}
+
+	index_endscan(index_scan);
+	table_close(heap, NoLock);
+	index_close(index, AccessShareLock);
+	ExecDropSingleTupleTableSlot(existing_slot);
+	FreeExecutorState(estate);
+
+	return all_not_null;
+}
\ No newline at end of file
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 8c278f202b..a918801760 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -45,6 +45,7 @@ typedef struct CookedConstraint
 	int16		inhcount;		/* number of times constraint is inherited */
 	bool		is_no_inherit;	/* constraint has local def and cannot be
 								 * inherited */
+	bool		pre_validated;  /* the constraint already validated? only for CONSTR_CHECK */
 } CookedConstraint;
 
 extern Relation heap_create(const char *relname,
@@ -113,6 +114,7 @@ extern List *AddRelationNewConstraints(Relation rel,
 									   bool allow_merge,
 									   bool is_local,
 									   bool is_internal,
+									   bool index_rebuild,
 									   const char *queryString);
 extern List *AddRelationNotNullConstraints(Relation rel,
 										   List *constraints,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 494ec4f2e5..962ffa3522 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -662,6 +662,7 @@ extern void check_exclusion_constraint(Relation heap, Relation index,
 									   const Datum *values, const bool *isnull,
 									   EState *estate, bool newIndex);
 
+extern bool index_check_notnull(Oid heap, Oid indexoid);
 /*
  * prototypes from functions in execReplication.c
  */
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 143109aa4d..32d7e51aa9 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -55,6 +55,7 @@ test: merge-delete
 test: merge-update
 test: merge-match-recheck
 test: merge-join
+test: fast-set-notnull
 test: delete-abort-savept
 test: delete-abort-savept-2
 test: aborted-keyrevoke
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 12852aa612..81e065914a 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -4738,3 +4738,38 @@ drop publication pub1;
 drop schema alter1 cascade;
 drop schema alter2 cascade;
 NOTICE:  drop cascades to table alter2.t1
+-----fast add not-null constraint tests.
+CREATE TABLE tp (a int, b int GENERATED ALWAYS AS (22) stored) PARTITION BY range(a);
+CREATE TABLE tpp1(a int, b int GENERATED ALWAYS AS (a+1) stored);
+CREATE TABLE tpp2(a int, b int GENERATED ALWAYS AS (a+10) stored);
+ALTER TABLE tp ATTACH PARTITION tpp1 FOR values from ( 1 ) to (100);
+ALTER TABLE tp ATTACH PARTITION tpp2 default;
+CREATE INDEX ON tp(b);
+insert into tp select g from generate_series(100,120) g;
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere the index rebuild.
+ALTER TABLE tp alter column b set not null, ALTER COLUMN b SET EXPRESSION AS (a * 3);
+ALTER TABLE tp alter column b drop not null;
+ALTER TABLE tp add constraint nn not null b, ALTER COLUMN b SET EXPRESSION AS (a * 11);
+------test non-partitioned table
+CREATE TABLE t1(f1 INT, f2 int, f3 int);
+INSERT INTO t1 select g, g+1, g+2 from generate_series(1, 20) g;
+CREATE INDEX t1_f1idx ON t1(f1);
+CREATE INDEX t1_f2idx ON t1(f2);
+CREATE UNIQUE INDEX t1_f3idx ON t1(f3);
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere the index rebuild.
+ALTER TABLE t1 alter column f1 set not null, ALTER f1 TYPE BIGINT;
+ALTER TABLE t1 alter column f1 drop not null;
+ALTER TABLE t1 add constraint nn not null f1, ALTER f1 TYPE bigint;
+--ok.
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 alter column f2 set not null, alter column f1 set not null;
+--ok.
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 add constraint nn not null f2, add constraint nnf1 not null f1;
+--ok.
+ALTER TABLE t1 add constraint t_f3_pk primary key using index t1_f3idx;
+NOTICE:  ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index "t1_f3idx" to "t_f3_pk"
+drop table t1;
+drop table tpp1, tpp2, tp;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index c88f9eaab0..e64b3dceed 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -3102,3 +3102,43 @@ alter table alter1.t1 set schema alter2;
 drop publication pub1;
 drop schema alter1 cascade;
 drop schema alter2 cascade;
+
+-----fast add not-null constraint tests.
+CREATE TABLE tp (a int, b int GENERATED ALWAYS AS (22) stored) PARTITION BY range(a);
+CREATE TABLE tpp1(a int, b int GENERATED ALWAYS AS (a+1) stored);
+CREATE TABLE tpp2(a int, b int GENERATED ALWAYS AS (a+10) stored);
+ALTER TABLE tp ATTACH PARTITION tpp1 FOR values from ( 1 ) to (100);
+ALTER TABLE tp ATTACH PARTITION tpp2 default;
+CREATE INDEX ON tp(b);
+insert into tp select g from generate_series(100,120) g;
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere the index rebuild.
+ALTER TABLE tp alter column b set not null, ALTER COLUMN b SET EXPRESSION AS (a * 3);
+ALTER TABLE tp alter column b drop not null;
+ALTER TABLE tp add constraint nn not null b, ALTER COLUMN b SET EXPRESSION AS (a * 11);
+
+------test non-partitioned table
+CREATE TABLE t1(f1 INT, f2 int, f3 int);
+INSERT INTO t1 select g, g+1, g+2 from generate_series(1, 20) g;
+CREATE INDEX t1_f1idx ON t1(f1);
+CREATE INDEX t1_f2idx ON t1(f2);
+CREATE UNIQUE INDEX t1_f3idx ON t1(f3);
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere the index rebuild.
+ALTER TABLE t1 alter column f1 set not null, ALTER f1 TYPE BIGINT;
+ALTER TABLE t1 alter column f1 drop not null;
+ALTER TABLE t1 add constraint nn not null f1, ALTER f1 TYPE bigint;
+
+--ok.
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 alter column f2 set not null, alter column f1 set not null;
+--ok.
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 add constraint nn not null f2, add constraint nnf1 not null f1;
+--ok.
+ALTER TABLE t1 add constraint t_f3_pk primary key using index t1_f3idx;
+
+drop table t1;
+drop table tpp1, tpp2, tp;
\ No newline at end of file
-- 
2.34.1



^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: using index to speedup add not null constraints to a table
@ 2024-12-31 16:09  jian he <[email protected]>
  parent: jian he <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: jian he @ 2024-12-31 16:09 UTC (permalink / raw)
  To: PostgreSQL Hackers <[email protected]>

hi.

In v1 I didn't do the `git add` for newly created isolation test related files.
so the cfbot for isolation tests failed.

v1 with index:
create index t_idx_ab on t(a,b);
we cannot fast add a not-null constraint for column b.
with the attached v2 patch, now we can do that.

v2, isolation test also adds other session drop index scarenio.


Attachments:

  [text/x-patch] v2-0001-using-index-to-speedup-add-not-null-constraint-to.patch (33.3K, 2-v2-0001-using-index-to-speedup-add-not-null-constraint-to.patch)
  download | inline diff:
From ac1ec78395479c61844e122c90ef26d545d35f69 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Tue, 31 Dec 2024 23:32:49 +0800
Subject: [PATCH v2 1/1] using index to speedup add not null constraint to
 existing table

This patch tries to use index_beginscan() / index_getnext() / index_endscan()
mentioned in [1] to speedup adding not-null constraints to the existing table.

There are two ways to add not-null constraints to an existing table.
1. ATExecAddConstraint->ATAddCheckNNConstraint->AddRelationNewConstraints
2. ATExecSetNotNull->AddRelationNewConstraints

The logic is:
1. In AddRelationNewConstraints, if condition meet (see get_index_for_validate_notnull)
   then attempt to use indexscan to pre check whether the column is NOT NULL.
2. AddRelationNewConstraints returned a list of CookedConstraint nodes.
   pass not-null cooked constraint to tab->constraints (a list of NewConstraint).
   code snippet:
    if (!ccon->skip_validation)
    {
        newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
        if (ccon->pre_validated)
            newcon->nn_attnum = ccon->attnum;
        tab->constraints = lappend(tab->constraints, newcon);
    }
3. In ATRewriteTable, if all attnums associated with not-null constraint have
already been prevalidated, then no need to needscan.

CAUTION:
ALTER TABLE SET DATA TYPE and ALTER TABLE SET EXPRESSION trigger an index
rebuild (refer to RememberAllDependentForRebuilding).  While the former cannot
be applied to partitioned tables, the latter can.  If an index rebuild occurs,
then indexscan cannot be used to prevalidate whether a column contains null
values.  Consequently, pre-validating a NOT NULL constraint is currently not
supported for partitioned tables.

concurrency concern:
ALTER TABLE SET NOT NULL will take an ACCESS EXCLUSIVE lock, so there is less
variant of racing issue can occur?  to prove accurate, I wrote some isolation
tests. see[2]

performance concern:
it will be slower than normal adding not-null constraint ONLY IF your index is
bloat and a large percent of bloat is related to null value data.
demo:

100% bloat and zero percent null value:
drop table if exists t \; create unlogged table t(a int, b int, c int);
insert into t select g, g+1 from generate_series(1,1_000_000) g;
create index t_idx_a on t(a);
delete from t;
alter table t add constraint t1 not null a;

patch Time:  1.873 ms
master Time: 648.312 ms
----------------------------------------------------
---- %20 percent column value is null and deleted from heap still on the index.
drop table if exists t \; create unlogged table t(a int, b int);
insert into t select case when g % 5 = 0 then null else g end, g+1
from generate_series(1,1_000_000) g;

create index t_idx_a on t(a);
delete from t where a is null;

alter table t add constraint t1 not null a;
patch Time:: 1080.407 ms (00:01.080)
master Time: 970.671 ms

references:
[1] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com
[2] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de

discussion: https://postgr.es/m/CACJufxFiW=4k1is=F1J=r-Cx1RuByXQPUrWB331U47rSnGz+hw@mail.gmail.com
---
 src/backend/catalog/heap.c                    |  88 +++++++++++-
 src/backend/catalog/pg_constraint.c           |   1 +
 src/backend/commands/tablecmds.c              |  80 +++++++++--
 src/backend/executor/execIndexing.c           |  93 ++++++++++++
 src/include/catalog/heap.h                    |   2 +
 src/include/executor/executor.h               |   1 +
 .../isolation/expected/fast-set-notnull.out   | 133 ++++++++++++++++++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/fast-set-notnull.spec     |  46 ++++++
 src/test/regress/expected/alter_table.out     |  38 +++++
 src/test/regress/sql/alter_table.sql          |  43 ++++++
 11 files changed, 516 insertions(+), 10 deletions(-)
 create mode 100644 src/test/isolation/expected/fast-set-notnull.out
 create mode 100644 src/test/isolation/specs/fast-set-notnull.spec

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index d7b88b61dc..7994b0c480 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -57,6 +57,7 @@
 #include "commands/tablecmds.h"
 #include "commands/typecmds.h"
 #include "common/int.h"
+#include "executor/executor.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
@@ -116,6 +117,7 @@ static Node *cookConstraint(ParseState *pstate,
 							char *relname);
 
 
+static Oid get_index_for_validate_notnull(Relation rel, int colnum);
 /* ----------------------------------------------------------------
  *				XXX UGLY HARD CODED BADNESS FOLLOWS XXX
  *
@@ -2298,6 +2300,8 @@ StoreConstraints(Relation rel, List *cooked_constraints, bool is_internal)
  * newConstraints: list of Constraint nodes
  * allow_merge: true if check constraints may be merged with existing ones
  * is_local: true if definition is local, false if it's inherited
+ * index_rebuild: true if the ALTER TABLE command and it's subcommands
+ * will do any index rebuild in the future.
  * is_internal: true if result of some internal process, not a user request
  * queryString: used during expression transformation of default values and
  *		cooked CHECK constraints
@@ -2322,6 +2326,7 @@ AddRelationNewConstraints(Relation rel,
 						  bool allow_merge,
 						  bool is_local,
 						  bool is_internal,
+						  bool index_rebuild,
 						  const char *queryString)
 {
 	List	   *cookedConstraints = NIL;
@@ -2409,6 +2414,7 @@ AddRelationNewConstraints(Relation rel,
 		cooked->is_local = is_local;
 		cooked->inhcount = is_local ? 0 : 1;
 		cooked->is_no_inherit = false;
+		cooked->pre_validated = false;
 		cookedConstraints = lappend(cookedConstraints, cooked);
 	}
 
@@ -2539,6 +2545,7 @@ AddRelationNewConstraints(Relation rel,
 			cooked->is_local = is_local;
 			cooked->inhcount = is_local ? 0 : 1;
 			cooked->is_no_inherit = cdef->is_no_inherit;
+			cooked->pre_validated = false;
 			cookedConstraints = lappend(cookedConstraints, cooked);
 		}
 		else if (cdef->contype == CONSTR_NOTNULL)
@@ -2547,6 +2554,7 @@ AddRelationNewConstraints(Relation rel,
 			AttrNumber	colnum;
 			int16		inhcount = is_local ? 0 : 1;
 			char	   *nnname;
+			Oid			indexoid = InvalidOid;
 
 			/* Determine which column to modify */
 			colnum = get_attnum(RelationGetRelid(rel), strVal(linitial(cdef->keys)));
@@ -2599,6 +2607,16 @@ AddRelationNewConstraints(Relation rel,
 								inhcount,
 								cdef->is_no_inherit);
 
+			/*
+			 * if there is any index to be rebuild, then we cannot do indexscan
+			 * in index_check_notnull; for re_added (is_internal is true)
+			 * not-null constraint also cannot use indexscan to check null
+			 * existence.
+			*/
+			if (rel->rd_rel->relkind == RELKIND_RELATION &&
+				!is_internal &&
+				!index_rebuild)
+				indexoid = get_index_for_validate_notnull(rel, colnum);
 			nncooked = (CookedConstraint *) palloc(sizeof(CookedConstraint));
 			nncooked->contype = CONSTR_NOTNULL;
 			nncooked->conoid = constrOid;
@@ -2609,7 +2627,17 @@ AddRelationNewConstraints(Relation rel,
 			nncooked->is_local = is_local;
 			nncooked->inhcount = inhcount;
 			nncooked->is_no_inherit = cdef->is_no_inherit;
-
+			nncooked->pre_validated = false;
+			if (OidIsValid(indexoid))
+			{
+				nncooked->pre_validated =
+						index_check_notnull(RelationGetRelid(rel),
+											indexoid,
+											colnum);
+			}
+			if (nncooked->pre_validated)
+				elog(DEBUG1, "already using index (%d) vertified that relation \"%s\" column number (%d) don't have null values",
+							  indexoid, RelationGetRelationName(rel), colnum);
 			cookedConstraints = lappend(cookedConstraints, nncooked);
 		}
 	}
@@ -2626,6 +2654,64 @@ AddRelationNewConstraints(Relation rel,
 	return cookedConstraints;
 }
 
+static Oid
+get_index_for_validate_notnull(Relation relation, int colnum)
+{
+	Relation	indrel;
+	SysScanDesc indscan;
+	ScanKeyData skey;
+	HeapTuple	htup;
+	TupleDesc	tupdesc;
+	Form_pg_attribute attr;
+	List	   *result = NIL;
+
+	tupdesc = RelationGetDescr(relation);
+
+	/* Prepare to scan pg_index for entries having indrelid = this rel. */
+	ScanKeyInit(&skey,
+				Anum_pg_index_indrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationGetRelid(relation)));
+
+	indrel = table_open(IndexRelationId, AccessShareLock);
+	indscan = systable_beginscan(indrel, IndexIndrelidIndexId, true,
+								 NULL, 1, &skey);
+
+	while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+	{
+		Form_pg_index index = (Form_pg_index) GETSTRUCT(htup);
+
+		if (!index->indimmediate || !index->indisvalid || !index->indislive)
+			continue;
+
+		if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL) ||
+			!heap_attisnull(htup, Anum_pg_index_indpred, NULL))
+			continue;
+
+		for (int i = 0; i < index->indnkeyatts; i++)
+		{
+			attr = TupleDescAttr(tupdesc, (index->indkey.values[i]) - 1);
+			if (attr->attnum == colnum)
+			{
+				Assert(OidIsValid(index->indexrelid));
+				result = lappend_oid(result, index->indexrelid);
+				break;
+			}
+		}
+	}
+	systable_endscan(indscan);
+	table_close(indrel, AccessShareLock);
+
+	if(list_length(result) == 0)
+		return InvalidOid;
+	else
+	{
+		/* Sort the result list into OID order, make the result stable. */
+		list_sort(result, list_oid_cmp);
+		return list_nth_oid(result, 0);
+	}
+}
+
 /*
  * Check for a pre-existing check constraint that conflicts with a proposed
  * new one, and either adjust its conislocal/coninhcount settings or throw
diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
index 9c05a98d28..6638f1fbe2 100644
--- a/src/backend/catalog/pg_constraint.c
+++ b/src/backend/catalog/pg_constraint.c
@@ -826,6 +826,7 @@ RelationGetNotNullConstraints(Oid relid, bool cooked, bool include_noinh)
 			cooked->is_local = true;
 			cooked->inhcount = 0;
 			cooked->is_no_inherit = conForm->connoinherit;
+			cooked->pre_validated = false;
 
 			notnulls = lappend(notnulls, cooked);
 		}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6ccae4cb4a..edaa4daca6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -219,6 +219,8 @@ typedef struct NewConstraint
 	Oid			refindid;		/* OID of PK's index, if FOREIGN */
 	bool		conwithperiod;	/* Whether the new FOREIGN KEY uses PERIOD */
 	Oid			conid;			/* OID of pg_constraint entry, if FOREIGN */
+	int			nn_attnum;		/* already validated NOT-NULL constraint attnum.
+									-1 means not applicable */
 	Node	   *qual;			/* Check expr or CONSTR_FOREIGN Constraint */
 	ExprState  *qualstate;		/* Execution state for CHECK expr */
 } NewConstraint;
@@ -977,6 +979,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 			cooked->is_local = true;	/* not used for defaults */
 			cooked->inhcount = 0;	/* ditto */
 			cooked->is_no_inherit = false;
+			cooked->pre_validated = false;
 			cookedDefaults = lappend(cookedDefaults, cooked);
 			attr->atthasdef = true;
 		}
@@ -1060,7 +1063,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	 */
 	if (rawDefaults)
 		AddRelationNewConstraints(rel, rawDefaults, NIL,
-								  true, true, false, queryString);
+								  true, true, false, false, queryString);
 
 	/*
 	 * Make column generation expressions visible for use by partitioning.
@@ -1291,7 +1294,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	 */
 	if (stmt->constraints)
 		AddRelationNewConstraints(rel, NIL, stmt->constraints,
-								  true, true, false, queryString);
+								  true, true, false, false, queryString);
 
 	/*
 	 * Finally, merge the not-null constraints that are declared directly with
@@ -6038,6 +6041,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 	TupleDesc	oldTupDesc;
 	TupleDesc	newTupDesc;
 	bool		needscan = false;
+	bool		prevalidated = true;
 	List	   *notnull_attrs;
 	int			i;
 	ListCell   *l;
@@ -6095,6 +6099,11 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 				needscan = true;
 				con->qualstate = ExecPrepareExpr((Expr *) con->qual, estate);
 				break;
+
+			case CONSTR_NOTNULL:
+				if(con->nn_attnum == -1)
+					prevalidated = false;
+				break;
 			case CONSTR_FOREIGN:
 				/* Nothing to do here */
 				break;
@@ -6134,7 +6143,12 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 			if (attr->attnotnull && !attr->attisdropped)
 				notnull_attrs = lappend_int(notnull_attrs, i);
 		}
-		if (notnull_attrs)
+
+		/*
+		 * we did some prevalidation at phase 2, so we can safely be sure no
+		 * need to scan. see AddRelationNewConstraints and index_check_notnull
+		*/
+		if (notnull_attrs && !prevalidated)
 			needscan = true;
 	}
 
@@ -7280,7 +7294,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		 * _list_ of defaults, but we just do one.
 		 */
 		AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-								  false, true, false, NULL);
+								  false, true, false, false, NULL);
 
 		/* Make the additional catalog changes visible */
 		CommandCounterIncrement();
@@ -7722,6 +7736,8 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName,
 	CookedConstraint *ccon;
 	List	   *cooked;
 	bool		is_no_inherit = false;
+	bool		index_rebuild = false;
+	AlteredTableInfo *tab = NULL;
 
 	/* Guard against stack overflow due to overly deep inheritance tree. */
 	check_stack_depth();
@@ -7837,10 +7853,44 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName,
 	constraint->is_no_inherit = is_no_inherit;
 	constraint->conname = conName;
 
+	/*
+	 * for ALTER TABLE ALTER COLUMN SET EXPRESSION, RememberIndexForRebuilding
+	 * can only remember the partitioned index entry
+	 * (RELKIND_PARTITIONED_INDEX), not the regular index
+	 * (RELKIND_PARTITIONED_INDEX). So we can not use index_check_notnull to
+	 * fast check the column not-null status for the partitioned table now.
+	 * that is why if recursing is true, we blindly assume index_rebuild is
+	 * true.  in future, we can iterate changedIndexOids, find out if there is
+	 * any index over that not-null column or not.
+	 */
+	tab = ATGetQueueEntry(wqueue, rel);
+	if (list_length(tab->changedIndexOids) > 0 || recursing)
+		index_rebuild = true;
+
 	/* and do it */
 	cooked = AddRelationNewConstraints(rel, NIL, list_make1(constraint),
-									   false, !recursing, false, NULL);
+									   false, !recursing, false, index_rebuild, NULL);
 	ccon = linitial(cooked);
+
+	if (!ccon->skip_validation)
+	{
+		NewConstraint *newcon;
+
+		newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
+		newcon->name = ccon->name;
+		newcon->contype = ccon->contype;
+		if (ccon->pre_validated)
+		{
+			newcon->nn_attnum = ccon->attnum;
+			Assert(ccon->attnum > 0);
+		}
+		else
+			newcon->nn_attnum = -1;
+		newcon->qual = NULL;
+
+		tab->constraints = lappend(tab->constraints, newcon);
+	}
+
 	ObjectAddressSet(address, ConstraintRelationId, ccon->conoid);
 
 	InvokeObjectPostAlterHook(RelationRelationId,
@@ -7988,7 +8038,7 @@ ATExecColumnDefault(Relation rel, const char *colName,
 		 * _list_ of defaults, but we just do one.
 		 */
 		AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-								  false, true, false, NULL);
+								  false, true, false, false, NULL);
 	}
 
 	ObjectAddressSubSet(address, RelationRelationId,
@@ -8450,7 +8500,7 @@ ATExecSetExpression(AlteredTableInfo *tab, Relation rel, const char *colName,
 
 	/* Store the generated expression */
 	AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-							  false, true, false, NULL);
+							  false, true, false, false, NULL);
 
 	/* Make above new expression visible */
 	CommandCounterIncrement();
@@ -9554,6 +9604,10 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	List	   *children;
 	ListCell   *child;
 	ObjectAddress address = InvalidObjectAddress;
+	bool		index_rebuild = false;
+
+	if (list_length(tab->changedIndexOids) > 0 || recursing)
+		index_rebuild = true;
 
 	/* Guard against stack overflow due to overly deep inheritance tree. */
 	check_stack_depth();
@@ -9578,6 +9632,7 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 										recursing || is_readd,	/* allow_merge */
 										!recursing, /* is_local */
 										is_readd,	/* is_internal */
+										index_rebuild,
 										NULL);	/* queryString not available
 												 * here */
 
@@ -9589,15 +9644,19 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	{
 		CookedConstraint *ccon = (CookedConstraint *) lfirst(lcon);
 
-		if (!ccon->skip_validation && ccon->contype != CONSTR_NOTNULL)
+		if (!ccon->skip_validation)
 		{
 			NewConstraint *newcon;
 
 			newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
 			newcon->name = ccon->name;
 			newcon->contype = ccon->contype;
+			if (ccon->pre_validated && ccon->contype == CONSTR_NOTNULL)
+				newcon->nn_attnum = ccon->attnum;
+			else
+				newcon->nn_attnum = -1;
+
 			newcon->qual = ccon->expr;
-
 			tab->constraints = lappend(tab->constraints, newcon);
 		}
 
@@ -10718,6 +10777,7 @@ addFkRecurseReferencing(List **wqueue, Constraint *fkconstraint, Relation rel,
 			newcon->refindid = indexOid;
 			newcon->conid = parentConstr;
 			newcon->conwithperiod = fkconstraint->fk_with_period;
+			newcon->nn_attnum = -1;
 			newcon->qual = (Node *) fkconstraint;
 
 			tab->constraints = lappend(tab->constraints, newcon);
@@ -12026,6 +12086,7 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
 			newcon->refrelid = con->confrelid;
 			newcon->refindid = con->conindid;
 			newcon->conid = con->oid;
+			newcon->nn_attnum = -1;
 			newcon->qual = (Node *) fkconstraint;
 
 			/* Find or create work queue entry for this table */
@@ -12098,6 +12159,7 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
 
 			val = SysCacheGetAttrNotNull(CONSTROID, tuple,
 										 Anum_pg_constraint_conbin);
+			newcon->nn_attnum = -1;
 			conbin = TextDatumGetCString(val);
 			newcon->qual = (Node *) stringToNode(conbin);
 
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index f0a5f8879a..37a7d3a96d 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -108,6 +108,7 @@
 
 #include "access/genam.h"
 #include "access/relscan.h"
+#include "access/table.h"
 #include "access/tableam.h"
 #include "access/xact.h"
 #include "catalog/index.h"
@@ -1166,3 +1167,95 @@ ExecWithoutOverlapsNotEmpty(Relation rel, NameData attname, Datum attval, char t
 				 errmsg("empty WITHOUT OVERLAPS value found in column \"%s\" in relation \"%s\"",
 						NameStr(attname), RelationGetRelationName(rel))));
 }
+
+bool
+index_check_notnull(Oid table_oid, Oid indexoid, AttrNumber colnum)
+{
+	Relation index;
+	Relation heap;
+
+	SnapshotData DirtySnapshot;
+	IndexScanDesc index_scan;
+	ScanKeyData scankeys[INDEX_MAX_KEYS];
+	TupleTableSlot *existing_slot;
+	IndexInfo  *indexInfo;
+	EState	   *estate;
+	int			indnkeyatts;
+	ExprContext *econtext;
+	bool all_not_null = true;
+	AttrNumber	sk_attno = -1;
+
+	heap = table_open(table_oid, NoLock);
+	index = index_open(indexoid, AccessShareLock);
+	/*
+	 * Need an EState for slot to hold the current tuple.
+	 *
+	 */
+	estate = CreateExecutorState();
+	econtext = GetPerTupleExprContext(estate);
+	existing_slot = table_slot_create(heap, NULL);
+
+	/* Arrange for econtext's scan tuple to be the tuple under test */
+	econtext->ecxt_scantuple = existing_slot;
+
+	indexInfo = BuildIndexInfo(index);
+	indnkeyatts = IndexRelationGetNumberOfKeyAttributes(index);
+
+	/*
+	 * Search the tuples that are in the index for any violations, including
+	 * tuples that aren't visible yet.
+	 */
+	InitDirtySnapshot(DirtySnapshot);
+
+	for (int i = 0; i < indnkeyatts; i++)
+	{
+		if (indexInfo->ii_IndexAttrNumbers[i] == colnum)
+		{
+			sk_attno = i+1;
+			break;
+		}
+	}
+
+	if (sk_attno == -1)
+		elog(ERROR, "index %u should effect on column number %d", indexoid, colnum);
+
+	for (int i = 0; i < indnkeyatts; i++)
+	{
+		/* set up an IS NULL scan key so that we ignore not nulls */
+		ScanKeyEntryInitialize(&scankeys[i],
+								SK_ISNULL | SK_SEARCHNULL,
+								sk_attno,		/* index col to scan */
+								InvalidStrategy,/* no strategy */
+								InvalidOid,		/* no strategy subtype */
+								InvalidOid,		/* no collation */
+								InvalidOid,		/* no reg proc for this */
+								(Datum) 0);		/* constant */
+	}
+
+	index_scan = index_beginscan(heap, index, &DirtySnapshot, indnkeyatts, 0);
+	index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0);
+	while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot))
+	{
+		Datum		values[INDEX_MAX_KEYS];
+		bool		nulls[INDEX_MAX_KEYS];
+
+		/*
+		 * Extract the index column values and isnull flags from the existing
+		 * tuple.
+		 */
+		FormIndexDatum(indexInfo, existing_slot, estate, values, nulls);
+
+		if (nulls[sk_attno - 1])
+		{
+			all_not_null = false;
+			break;
+		}
+	}
+	index_endscan(index_scan);
+	table_close(heap, NoLock);
+	index_close(index, AccessShareLock);
+	ExecDropSingleTupleTableSlot(existing_slot);
+	FreeExecutorState(estate);
+
+	return all_not_null;
+}
\ No newline at end of file
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 8c278f202b..45e2dd1cfa 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -45,6 +45,7 @@ typedef struct CookedConstraint
 	int16		inhcount;		/* number of times constraint is inherited */
 	bool		is_no_inherit;	/* constraint has local def and cannot be
 								 * inherited */
+	bool		pre_validated;  /* the constraint already validated? only for CONSTR_NOTNULL */
 } CookedConstraint;
 
 extern Relation heap_create(const char *relname,
@@ -113,6 +114,7 @@ extern List *AddRelationNewConstraints(Relation rel,
 									   bool allow_merge,
 									   bool is_local,
 									   bool is_internal,
+									   bool index_rebuild,
 									   const char *queryString);
 extern List *AddRelationNotNullConstraints(Relation rel,
 										   List *constraints,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 494ec4f2e5..355c3c192d 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -662,6 +662,7 @@ extern void check_exclusion_constraint(Relation heap, Relation index,
 									   const Datum *values, const bool *isnull,
 									   EState *estate, bool newIndex);
 
+extern bool index_check_notnull(Oid heap, Oid indexoid, AttrNumber colnum);
 /*
  * prototypes from functions in execReplication.c
  */
diff --git a/src/test/isolation/expected/fast-set-notnull.out b/src/test/isolation/expected/fast-set-notnull.out
new file mode 100644
index 0000000000..826f1dbe51
--- /dev/null
+++ b/src/test/isolation/expected/fast-set-notnull.out
@@ -0,0 +1,133 @@
+Parsed test spec with 2 sessions
+
+starting permutation: b1 m1 s0 s1 s2 r1 s2
+step b1: BEGIN ISOLATION LEVEL READ COMMITTED;
+step m1: DELETE FROM t;
+step s0: savepoint s0;
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+step r1: ROLLBACK;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+(0 rows)
+
+
+starting permutation: b1 m1 s0 s1 s2 c1 s2
+step b1: BEGIN ISOLATION LEVEL READ COMMITTED;
+step m1: DELETE FROM t;
+step s0: savepoint s0;
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+step c1: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b1 b3 m1 hj c1 c3 s2
+step b1: BEGIN ISOLATION LEVEL READ COMMITTED;
+step b3: BEGIN ISOLATION LEVEL READ COMMITTED;
+step m1: DELETE FROM t;
+step hj: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a; <waiting ...>
+step c1: COMMIT;
+step hj: <... completed>
+step c3: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b2 b4 m1 hj c1 c3 s2
+step b2: BEGIN;
+step b4: BEGIN;
+step m1: DELETE FROM t;
+step hj: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a; <waiting ...>
+step c1: COMMIT;
+step hj: <... completed>
+step c3: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b1 b3 m2 s1 c3 c1 s2
+step b1: BEGIN ISOLATION LEVEL READ COMMITTED;
+step b3: BEGIN ISOLATION LEVEL READ COMMITTED;
+step m2: DELETE FROM t;
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a; <waiting ...>
+step c3: COMMIT;
+step s1: <... completed>
+step c1: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b1 b4 d1 m1 c2 s1 s2 r1
+step b1: BEGIN ISOLATION LEVEL READ COMMITTED;
+step b4: BEGIN;
+step d1: DROP INDEX t_ab_idx;
+step m1: DELETE FROM t; <waiting ...>
+step c2: ROLLBACK;
+step m1: <... completed>
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+step r1: ROLLBACK;
+
+starting permutation: b1 b4 d1 m1 c3 s1 s2 c1
+step b1: BEGIN ISOLATION LEVEL READ COMMITTED;
+step b4: BEGIN;
+step d1: DROP INDEX t_ab_idx;
+step m1: DELETE FROM t; <waiting ...>
+step c3: COMMIT;
+step m1: <... completed>
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+step c1: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 143109aa4d..32d7e51aa9 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -55,6 +55,7 @@ test: merge-delete
 test: merge-update
 test: merge-match-recheck
 test: merge-join
+test: fast-set-notnull
 test: delete-abort-savept
 test: delete-abort-savept-2
 test: aborted-keyrevoke
diff --git a/src/test/isolation/specs/fast-set-notnull.spec b/src/test/isolation/specs/fast-set-notnull.spec
new file mode 100644
index 0000000000..d8198c869a
--- /dev/null
+++ b/src/test/isolation/specs/fast-set-notnull.spec
@@ -0,0 +1,46 @@
+#
+# fast not-null tests
+#
+
+setup
+{
+  CREATE TABLE t (a int, b int);
+  CREATE INDEX t_ab_idx on t(a,b);
+  INSERT INTO t values (null, 1);
+  INSERT INTO t SELECT x, x*10 FROM generate_series(1,3) g(x);
+}
+
+teardown
+{
+  DROP TABLE t;
+}
+
+session s1
+step b1  { BEGIN ISOLATION LEVEL READ COMMITTED; }
+step b2  { BEGIN; }
+step m1  { DELETE FROM t;}
+step s0  { savepoint s0;}
+step s1  { ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;}
+step s2  { SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+         }
+step r1  { ROLLBACK; }
+step c1  { COMMIT; }
+
+session s2
+step b3  { BEGIN ISOLATION LEVEL READ COMMITTED; }
+step b4  { BEGIN; }
+step hj  { ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;}
+step m2  { DELETE FROM t;}
+step d1  { DROP INDEX t_ab_idx;}
+step c2  { ROLLBACK; }
+step c3  { COMMIT; }
+
+permutation b1 m1 s0 s1 s2 r1 s2
+permutation b1 m1 s0 s1 s2 c1 s2
+permutation b1 b3 m1 hj c1 c3 s2
+permutation b2 b4 m1 hj c1 c3 s2
+permutation b1 b3 m2 s1 c3 c1 s2
+permutation b1 b4 d1 m1 c2 s1 s2 r1
+permutation b1 b4 d1 m1 c3 s1 s2 c1
+
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 12852aa612..35b80b49e6 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -4738,3 +4738,41 @@ drop publication pub1;
 drop schema alter1 cascade;
 drop schema alter2 cascade;
 NOTICE:  drop cascades to table alter2.t1
+-----fast add not-null constraint tests.
+CREATE TABLE tp (a int, b int GENERATED ALWAYS AS (22) stored) PARTITION BY range(a);
+CREATE TABLE tpp1(a int, b int GENERATED ALWAYS AS (a+1) stored);
+CREATE TABLE tpp2(a int, b int GENERATED ALWAYS AS (a+10) stored);
+ALTER TABLE tp ATTACH PARTITION tpp1 FOR values from ( 1 ) to (100);
+ALTER TABLE tp ATTACH PARTITION tpp2 default;
+CREATE INDEX ON tp(b);
+insert into tp select g from generate_series(100,120) g;
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE tp alter column b set not null, ALTER COLUMN b SET EXPRESSION AS (a * 3);
+ALTER TABLE tp alter column b drop not null;
+ALTER TABLE tp add constraint nn not null b, ALTER COLUMN b SET EXPRESSION AS (a * 11);
+------test non-partitioned table
+CREATE TABLE t1(f1 INT, f2 int, f3 int,f4 int);
+INSERT INTO t1 select g, g+1, g+2, g+3 from generate_series(1, 20) g;
+CREATE INDEX t1_f1_f2_idx ON t1(f1,f2);
+CREATE UNIQUE INDEX t1_f3idx ON t1(f3);
+CREATE INDEX t1_f3f4idx ON t1(f3) include(f4);
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE t1 alter column f1 set not null, ALTER f1 TYPE BIGINT;
+ALTER TABLE t1 alter column f1 drop not null;
+ALTER TABLE t1 add constraint nn not null f1, ALTER f1 TYPE bigint;
+--ok. with ALTER COLUMN SET NOT NULL variant
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 alter column f2 set not null, alter column f1 set not null;
+--ok. with add constraint variants
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 add constraint nn not null f2, add constraint nnf1 not null f1;
+--ok.
+ALTER TABLE t1 add constraint t_f3_pk primary key using index t1_f3idx;
+NOTICE:  ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index "t1_f3idx" to "t_f3_pk"
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--for column f4 now, can only for keycolumn, not include column.
+ALTER TABLE t1 add constraint nnf4 not null f4;
+drop table t1;
+drop table tpp1, tpp2, tp;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index c88f9eaab0..a92cddcdcc 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -3102,3 +3102,46 @@ alter table alter1.t1 set schema alter2;
 drop publication pub1;
 drop schema alter1 cascade;
 drop schema alter2 cascade;
+
+-----fast add not-null constraint tests.
+CREATE TABLE tp (a int, b int GENERATED ALWAYS AS (22) stored) PARTITION BY range(a);
+CREATE TABLE tpp1(a int, b int GENERATED ALWAYS AS (a+1) stored);
+CREATE TABLE tpp2(a int, b int GENERATED ALWAYS AS (a+10) stored);
+ALTER TABLE tp ATTACH PARTITION tpp1 FOR values from ( 1 ) to (100);
+ALTER TABLE tp ATTACH PARTITION tpp2 default;
+CREATE INDEX ON tp(b);
+insert into tp select g from generate_series(100,120) g;
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE tp alter column b set not null, ALTER COLUMN b SET EXPRESSION AS (a * 3);
+ALTER TABLE tp alter column b drop not null;
+ALTER TABLE tp add constraint nn not null b, ALTER COLUMN b SET EXPRESSION AS (a * 11);
+
+------test non-partitioned table
+CREATE TABLE t1(f1 INT, f2 int, f3 int,f4 int);
+INSERT INTO t1 select g, g+1, g+2, g+3 from generate_series(1, 20) g;
+CREATE INDEX t1_f1_f2_idx ON t1(f1,f2);
+CREATE UNIQUE INDEX t1_f3idx ON t1(f3);
+CREATE INDEX t1_f3f4idx ON t1(f3) include(f4);
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE t1 alter column f1 set not null, ALTER f1 TYPE BIGINT;
+ALTER TABLE t1 alter column f1 drop not null;
+ALTER TABLE t1 add constraint nn not null f1, ALTER f1 TYPE bigint;
+
+--ok. with ALTER COLUMN SET NOT NULL variant
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 alter column f2 set not null, alter column f1 set not null;
+--ok. with add constraint variants
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 add constraint nn not null f2, add constraint nnf1 not null f1;
+--ok.
+ALTER TABLE t1 add constraint t_f3_pk primary key using index t1_f3idx;
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--for column f4 now, can only for keycolumn, not include column.
+ALTER TABLE t1 add constraint nnf4 not null f4;
+drop table t1;
+drop table tpp1, tpp2, tp;
-- 
2.34.1



^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: using index to speedup add not null constraints to a table
@ 2025-02-05 08:24  jian he <[email protected]>
  parent: jian he <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: jian he @ 2025-02-05 08:24 UTC (permalink / raw)
  To: PostgreSQL Hackers <[email protected]>

hi

rebased new patch attached.
I also did some cosmetic changes. comments refined.
make sure using index_scan mechanism to fast check column not-null can
only be used via btree index.
isolation tests are simplified.


Attachments:

  [text/x-patch] v3-0001-using-index-to-speedup-add-not-null-constraint-to.patch (33.3K, 2-v3-0001-using-index-to-speedup-add-not-null-constraint-to.patch)
  download | inline diff:
From be8fea79986339cadff21dbe9c4415b330a296a3 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Wed, 5 Feb 2025 16:05:10 +0800
Subject: [PATCH v3 1/1] using index to speedup add not null constraint to
 existing table

This patch tries to use index_beginscan() / index_getnext() / index_endscan()
mentioned in [1] to speedup adding not-null constraints to the existing table.

There are two ways to add not-null constraints to an existing table.
1. ATExecAddConstraint->ATAddCheckNNConstraint->AddRelationNewConstraints
2. ATExecSetNotNull->AddRelationNewConstraints

The logic is:
1. In AddRelationNewConstraints, if condition meet (see get_index_for_validate_notnull)
   then attempt to use indexscan to pre check whether the column is NOT NULL.
2. AddRelationNewConstraints returned a list of CookedConstraint nodes.
   pass not-null cooked constraint to tab->constraints (a list of NewConstraint).
   code snippet:
    if (!ccon->skip_validation)
    {
        newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
        if (ccon->pre_validated)
            newcon->nn_attnum = ccon->attnum;
        tab->constraints = lappend(tab->constraints, newcon);
    }
3. In ATRewriteTable, if all attnums associated with not-null constraint have
already been prevalidated, then no need to needscan.

CAUTION:
ALTER TABLE SET DATA TYPE and ALTER TABLE SET EXPRESSION trigger an index
rebuild (refer to RememberAllDependentForRebuilding).  While the former cannot
be applied to partitioned tables, the latter can.  If an index rebuild occurs,
then indexscan cannot be used to prevalidate whether a column contains null
values.  Consequently, pre-validating a NOT NULL constraint is currently not
supported for partitioned tables.

concurrency concern:
ALTER TABLE SET NOT NULL will take an ACCESS EXCLUSIVE lock, so there is less
variant of racing issue can occur?  to prove accurate, I wrote some isolation
tests. see[2]

performance concern:
it will be slower than normal adding not-null constraint ONLY IF your index is
bloat and a large percent of bloat is related to null value data.
demo:

100% bloat and zero percent null value:
drop table if exists t \; create unlogged table t(a int, b int, c int);
insert into t select g, g+1 from generate_series(1,1_000_000) g;
create index t_idx_a on t(a);
delete from t;
alter table t add constraint t1 not null a;

patch Time:  1.873 ms
master Time: 648.312 ms
----------------------------------------------------
---- %20 percent column value is null and deleted from heap still on the index.
drop table if exists t \; create unlogged table t(a int, b int);
insert into t select case when g % 5 = 0 then null else g end, g+1
from generate_series(1,1_000_000) g;

create index t_idx_a on t(a);
delete from t where a is null;

alter table t add constraint t1 not null a;
patch Time:: 1080.407 ms (00:01.080)
master Time: 970.671 ms

references:
[1] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com
[2] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de
---
 src/backend/catalog/heap.c                    | 90 +++++++++++++++++
 src/backend/catalog/pg_constraint.c           |  1 +
 src/backend/commands/tablecmds.c              | 87 ++++++++++++++---
 src/backend/executor/execIndexing.c           | 97 +++++++++++++++++++
 src/include/catalog/heap.h                    |  3 +
 src/include/executor/executor.h               |  1 +
 .../isolation/expected/fast-set-notnull.out   | 97 +++++++++++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 .../isolation/specs/fast-set-notnull.spec     | 43 ++++++++
 src/test/regress/expected/alter_table.out     | 41 ++++++++
 src/test/regress/sql/alter_table.sql          | 47 +++++++++
 11 files changed, 497 insertions(+), 11 deletions(-)
 create mode 100644 src/test/isolation/expected/fast-set-notnull.out
 create mode 100644 src/test/isolation/specs/fast-set-notnull.spec

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 57ef466acc..35cd64e87c 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -57,6 +57,7 @@
 #include "commands/tablecmds.h"
 #include "commands/typecmds.h"
 #include "common/int.h"
+#include "executor/executor.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
@@ -117,6 +118,7 @@ static Node *cookConstraint(ParseState *pstate,
 							char *relname);
 
 
+static Oid get_index_for_validate_notnull(Relation rel, int colnum);
 /* ----------------------------------------------------------------
  *				XXX UGLY HARD CODED BADNESS FOLLOWS XXX
  *
@@ -2286,6 +2288,8 @@ StoreConstraints(Relation rel, List *cooked_constraints, bool is_internal)
  * newConstraints: list of Constraint nodes
  * allow_merge: true if check constraints may be merged with existing ones
  * is_local: true if definition is local, false if it's inherited
+ * index_rebuild: true if the ALTER TABLE command and it's subcommands
+ * will do any index rebuild in the future.
  * is_internal: true if result of some internal process, not a user request
  * queryString: used during expression transformation of default values and
  *		cooked CHECK constraints
@@ -2310,6 +2314,7 @@ AddRelationNewConstraints(Relation rel,
 						  bool allow_merge,
 						  bool is_local,
 						  bool is_internal,
+						  bool index_rebuild,
 						  const char *queryString)
 {
 	List	   *cookedConstraints = NIL;
@@ -2398,6 +2403,7 @@ AddRelationNewConstraints(Relation rel,
 		cooked->is_local = is_local;
 		cooked->inhcount = is_local ? 0 : 1;
 		cooked->is_no_inherit = false;
+		cooked->validated = false;
 		cookedConstraints = lappend(cookedConstraints, cooked);
 	}
 
@@ -2532,6 +2538,7 @@ AddRelationNewConstraints(Relation rel,
 			cooked->is_local = is_local;
 			cooked->inhcount = is_local ? 0 : 1;
 			cooked->is_no_inherit = cdef->is_no_inherit;
+			cooked->validated = false;
 			cookedConstraints = lappend(cookedConstraints, cooked);
 		}
 		else if (cdef->contype == CONSTR_NOTNULL)
@@ -2540,6 +2547,7 @@ AddRelationNewConstraints(Relation rel,
 			AttrNumber	colnum;
 			int16		inhcount = is_local ? 0 : 1;
 			char	   *nnname;
+			Oid			indexoid = InvalidOid;
 
 			/* Determine which column to modify */
 			colnum = get_attnum(RelationGetRelid(rel), strVal(linitial(cdef->keys)));
@@ -2592,6 +2600,15 @@ AddRelationNewConstraints(Relation rel,
 								inhcount,
 								cdef->is_no_inherit);
 
+			/*
+			 * if there is any index to be rebuild, then we cannot do indexscan
+			 * in index_check_notnull; for re_added (is_internal is true)
+			 * not-null constraint also cannot use indexscan to check null
+			 * existence.
+			*/
+			if (rel->rd_rel->relkind == RELKIND_RELATION && !is_internal && !index_rebuild)
+				indexoid = get_index_for_validate_notnull(rel, colnum);
+
 			nncooked = (CookedConstraint *) palloc(sizeof(CookedConstraint));
 			nncooked->contype = CONSTR_NOTNULL;
 			nncooked->conoid = constrOid;
@@ -2603,7 +2620,14 @@ AddRelationNewConstraints(Relation rel,
 			nncooked->is_local = is_local;
 			nncooked->inhcount = inhcount;
 			nncooked->is_no_inherit = cdef->is_no_inherit;
+			nncooked->validated = false;
+			if (OidIsValid(indexoid))
+				nncooked->validated = index_check_notnull(RelationGetRelid(rel), indexoid, colnum);
 
+			if (nncooked->validated)
+				ereport(DEBUG1,
+						errmsg_internal("using index (%d) vertified relation \"%s\" column number (%d) don't have null values",
+										 indexoid, RelationGetRelationName(rel), colnum));
 			cookedConstraints = lappend(cookedConstraints, nncooked);
 		}
 	}
@@ -2620,6 +2644,72 @@ AddRelationNewConstraints(Relation rel,
 	return cookedConstraints;
 }
 
+static Oid
+get_index_for_validate_notnull(Relation relation, int colnum)
+{
+	Relation	indrel;
+	Relation	index_rel;
+	SysScanDesc indscan;
+	ScanKeyData skey;
+	HeapTuple	htup;
+	TupleDesc	tupdesc;
+	Form_pg_attribute attr;
+	List	   *result = NIL;
+
+	tupdesc = RelationGetDescr(relation);
+
+	/* Prepare to scan pg_index for entries having indrelid = this rel. */
+	ScanKeyInit(&skey,
+				Anum_pg_index_indrelid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(RelationGetRelid(relation)));
+
+	indrel = table_open(IndexRelationId, AccessShareLock);
+	indscan = systable_beginscan(indrel, IndexIndrelidIndexId, true,
+								 NULL, 1, &skey);
+
+	while (HeapTupleIsValid(htup = systable_getnext(indscan)))
+	{
+		Form_pg_index index = (Form_pg_index) GETSTRUCT(htup);
+
+		if (!index->indimmediate || !index->indisvalid || !index->indislive)
+			continue;
+
+		if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL) ||
+			!heap_attisnull(htup, Anum_pg_index_indpred, NULL))
+			continue;
+
+		for (int i = 0; i < index->indnkeyatts; i++)
+		{
+			attr = TupleDescAttr(tupdesc, (index->indkey.values[i]) - 1);
+			if (attr->attnum == colnum)
+			{
+				Assert(OidIsValid(index->indexrelid));
+
+				/* currently, only validate column not-null can only done by btree index */
+				index_rel = index_open(index->indexrelid, AccessShareLock);
+				if (index_rel->rd_rel->relkind == RELKIND_INDEX &&
+					index_rel->rd_rel->relam == BTREE_AM_OID)
+					result = lappend_oid(result, index->indexrelid);
+
+				index_close(index_rel, AccessShareLock);
+				break;
+			}
+		}
+	}
+	systable_endscan(indscan);
+	table_close(indrel, AccessShareLock);
+
+	if(list_length(result) == 0)
+		return InvalidOid;
+	else
+	{
+		/* Sort the result list into OID order, make the result stable. */
+		list_sort(result, list_oid_cmp);
+		return list_nth_oid(result, 0);
+	}
+}
+
 /*
  * Check for a pre-existing check constraint that conflicts with a proposed
  * new one, and either adjust its conislocal/coninhcount settings or throw
diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
index ac80652baf..ae1f779e8e 100644
--- a/src/backend/catalog/pg_constraint.c
+++ b/src/backend/catalog/pg_constraint.c
@@ -834,6 +834,7 @@ RelationGetNotNullConstraints(Oid relid, bool cooked, bool include_noinh)
 			cooked->is_local = true;
 			cooked->inhcount = 0;
 			cooked->is_no_inherit = conForm->connoinherit;
+			cooked->validated = false;
 
 			notnulls = lappend(notnulls, cooked);
 		}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 18f64db6e3..39f4fe54d6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -210,15 +210,17 @@ typedef struct AlteredTableInfo
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
-/* Note: new not-null constraints are handled elsewhere */
+/* Note: new not-null constraints are handled here also */
 typedef struct NewConstraint
 {
 	char	   *name;			/* Constraint name, or NULL if none */
-	ConstrType	contype;		/* CHECK or FOREIGN */
+	ConstrType	contype;		/* CHECK or FOREIGN or NOT-NULL */
 	Oid			refrelid;		/* PK rel, if FOREIGN */
 	Oid			refindid;		/* OID of PK's index, if FOREIGN */
 	bool		conwithperiod;	/* Whether the new FOREIGN KEY uses PERIOD */
 	Oid			conid;			/* OID of pg_constraint entry, if FOREIGN */
+	int			nn_attnum;		/* already validated NOT-NULL constraint attnum.
+								   -1 means not applicable */
 	Node	   *qual;			/* Check expr or CONSTR_FOREIGN Constraint */
 	ExprState  *qualstate;		/* Execution state for CHECK expr */
 } NewConstraint;
@@ -990,6 +992,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 			cooked->is_local = true;	/* not used for defaults */
 			cooked->inhcount = 0;	/* ditto */
 			cooked->is_no_inherit = false;
+			cooked->validated = false;
 			cookedDefaults = lappend(cookedDefaults, cooked);
 			attr->atthasdef = true;
 		}
@@ -1075,7 +1078,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	 */
 	if (rawDefaults)
 		AddRelationNewConstraints(rel, rawDefaults, NIL,
-								  true, true, false, queryString);
+								  true, true, false, false, queryString);
 
 	/*
 	 * Make column generation expressions visible for use by partitioning.
@@ -1306,7 +1309,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 	 */
 	if (stmt->constraints)
 		AddRelationNewConstraints(rel, NIL, stmt->constraints,
-								  true, true, false, queryString);
+								  true, true, false, false, queryString);
 
 	/*
 	 * Finally, merge the not-null constraints that are declared directly with
@@ -6071,6 +6074,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap)
 	TupleDesc	oldTupDesc;
 	TupleDesc	newTupDesc;
 	bool		needscan = false;
+	bool		validated = true;
 	List	   *notnull_attrs;
 	int			i;
 	ListCell   *l;
@@ -6132,6 +6136,10 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap)
 				needscan = true;
 				con->qualstate = ExecPrepareExpr((Expr *) con->qual, estate);
 				break;
+			case CONSTR_NOTNULL:
+				if(con->nn_attnum == -1)
+					validated = false;
+				break;
 			case CONSTR_FOREIGN:
 				/* Nothing to do here */
 				break;
@@ -6171,7 +6179,15 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap)
 			if (attr->attnotnull && !attr->attisdropped)
 				notnull_attrs = lappend_int(notnull_attrs, i);
 		}
-		if (notnull_attrs)
+
+		/*
+		 * we may did some pre-validation at phase 2. if validated is true, then
+		 * all the not-null constraints for this DDL have already been
+		 * pre-validated, no need to scan it again.  tab->constraints already
+		 * include all CONSTR_NOTNULL constraint. See AddRelationNewConstraints
+		 * and index_check_notnull.
+		*/
+		if (notnull_attrs && !validated)
 			needscan = true;
 	}
 
@@ -7317,7 +7333,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		 * _list_ of defaults, but we just do one.
 		 */
 		AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-								  false, true, false, NULL);
+								  false, true, false, false, NULL);
 
 		/* Make the additional catalog changes visible */
 		CommandCounterIncrement();
@@ -7759,6 +7775,8 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName,
 	CookedConstraint *ccon;
 	List	   *cooked;
 	bool		is_no_inherit = false;
+	bool		index_rebuild = false;
+	AlteredTableInfo *tab = NULL;
 
 	/* Guard against stack overflow due to overly deep inheritance tree. */
 	check_stack_depth();
@@ -7874,10 +7892,45 @@ ATExecSetNotNull(List **wqueue, Relation rel, char *conName, char *colName,
 	constraint->is_no_inherit = is_no_inherit;
 	constraint->conname = conName;
 
+	/*
+	 * for ALTER TABLE ALTER COLUMN SET EXPRESSION, function
+	 * RememberIndexForRebuilding can only remember the partitioned index entry
+	 * (RELKIND_PARTITIONED_INDEX), not the regular index.
+	 * so we can not use index_check_notnull to fast check the column not-null
+	 * status for the partitioned table now.
+	 * that is why if recursing is true, we blindly assume index_rebuild is
+	 * true.  in future, we can iterate changedIndexOids, find out if there is
+	 * any index rebuild over that not-null column or not.
+	 */
+	tab = ATGetQueueEntry(wqueue, rel);
+	if (list_length(tab->changedIndexOids) > 0 || recursing)
+		index_rebuild = true;
+
 	/* and do it */
 	cooked = AddRelationNewConstraints(rel, NIL, list_make1(constraint),
-									   false, !recursing, false, NULL);
+									   false, !recursing, false, index_rebuild, NULL);
 	ccon = linitial(cooked);
+
+	if (!ccon->skip_validation)
+	{
+		NewConstraint *newcon;
+
+		newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
+		newcon->name = ccon->name;
+
+		Assert(ccon->contype == CONSTR_NOTNULL);
+		newcon->contype = ccon->contype;
+		if (ccon->validated)
+		{
+			newcon->nn_attnum = ccon->attnum;
+			Assert(ccon->attnum > 0);
+		}
+		else
+			newcon->nn_attnum = -1;
+
+		tab->constraints = lappend(tab->constraints, newcon);
+	}
+
 	ObjectAddressSet(address, ConstraintRelationId, ccon->conoid);
 
 	InvokeObjectPostAlterHook(RelationRelationId,
@@ -8025,7 +8078,7 @@ ATExecColumnDefault(Relation rel, const char *colName,
 		 * _list_ of defaults, but we just do one.
 		 */
 		AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-								  false, true, false, NULL);
+								  false, true, false, false, NULL);
 	}
 
 	ObjectAddressSubSet(address, RelationRelationId,
@@ -8487,7 +8540,7 @@ ATExecSetExpression(AlteredTableInfo *tab, Relation rel, const char *colName,
 
 	/* Store the generated expression */
 	AddRelationNewConstraints(rel, list_make1(rawEnt), NIL,
-							  false, true, false, NULL);
+							  false, true, false, false, NULL);
 
 	/* Make above new expression visible */
 	CommandCounterIncrement();
@@ -9591,6 +9644,10 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	List	   *children;
 	ListCell   *child;
 	ObjectAddress address = InvalidObjectAddress;
+	bool		index_rebuild = false;
+
+	if (list_length(tab->changedIndexOids) > 0 || recursing)
+		index_rebuild = true;
 
 	/* Guard against stack overflow due to overly deep inheritance tree. */
 	check_stack_depth();
@@ -9615,6 +9672,7 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 										recursing || is_readd,	/* allow_merge */
 										!recursing, /* is_local */
 										is_readd,	/* is_internal */
+										index_rebuild,
 										NULL);	/* queryString not available
 												 * here */
 
@@ -9626,15 +9684,19 @@ ATAddCheckNNConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	{
 		CookedConstraint *ccon = (CookedConstraint *) lfirst(lcon);
 
-		if (!ccon->skip_validation && ccon->contype != CONSTR_NOTNULL)
+		if (!ccon->skip_validation)
 		{
 			NewConstraint *newcon;
 
 			newcon = (NewConstraint *) palloc0(sizeof(NewConstraint));
 			newcon->name = ccon->name;
 			newcon->contype = ccon->contype;
+			if (ccon->validated && ccon->contype == CONSTR_NOTNULL)
+				newcon->nn_attnum = ccon->attnum;
+			else
+				newcon->nn_attnum = -1;
+
 			newcon->qual = ccon->expr;
-
 			tab->constraints = lappend(tab->constraints, newcon);
 		}
 
@@ -10732,6 +10794,7 @@ addFkRecurseReferencing(List **wqueue, Constraint *fkconstraint, Relation rel,
 			newcon->refindid = indexOid;
 			newcon->conid = parentConstr;
 			newcon->conwithperiod = fkconstraint->fk_with_period;
+			newcon->nn_attnum = -1;
 			newcon->qual = (Node *) fkconstraint;
 
 			tab->constraints = lappend(tab->constraints, newcon);
@@ -12146,6 +12209,7 @@ QueueFKConstraintValidation(List **wqueue, Relation conrel, Relation rel,
 		newcon->refrelid = con->confrelid;
 		newcon->refindid = con->conindid;
 		newcon->conid = con->oid;
+		newcon->nn_attnum = -1;
 		newcon->qual = (Node *) fkconstraint;
 
 		/* Find or create work queue entry for this table */
@@ -12285,6 +12349,7 @@ QueueCheckConstraintValidation(List **wqueue, Relation conrel, Relation rel,
 	newcon->refrelid = InvalidOid;
 	newcon->refindid = InvalidOid;
 	newcon->conid = con->oid;
+	newcon->nn_attnum = -1;
 
 	val = SysCacheGetAttrNotNull(CONSTROID, contuple,
 								 Anum_pg_constraint_conbin);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 7c87f012c3..bcef26f3cc 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -108,6 +108,7 @@
 
 #include "access/genam.h"
 #include "access/relscan.h"
+#include "access/table.h"
 #include "access/tableam.h"
 #include "access/xact.h"
 #include "catalog/index.h"
@@ -1166,3 +1167,99 @@ ExecWithoutOverlapsNotEmpty(Relation rel, NameData attname, Datum attval, char t
 				 errmsg("empty WITHOUT OVERLAPS value found in column \"%s\" in relation \"%s\"",
 						NameStr(attname), RelationGetRelationName(rel))));
 }
+
+/*
+ * colnum is the attnum where not-null constraint will be created on.
+*/
+bool
+index_check_notnull(Oid table_oid, Oid indexoid, AttrNumber colnum)
+{
+	Relation index;
+	Relation heap;
+
+	SnapshotData DirtySnapshot;
+	IndexScanDesc index_scan;
+	ScanKeyData scankeys[INDEX_MAX_KEYS];
+	TupleTableSlot *existing_slot;
+	IndexInfo  *indexInfo;
+	EState	   *estate;
+	int			indnkeyatts;
+	ExprContext *econtext;
+	bool all_not_null = true;
+	AttrNumber	sk_attno = -1;
+
+	heap = table_open(table_oid, NoLock);
+	index = index_open(indexoid, AccessShareLock);
+
+	/*
+	 * Need an EState for slot to hold the current tuple.
+	 *
+	 */
+	estate = CreateExecutorState();
+	econtext = GetPerTupleExprContext(estate);
+	existing_slot = table_slot_create(heap, NULL);
+
+	/* Arrange for econtext's scan tuple to be the tuple under test */
+	econtext->ecxt_scantuple = existing_slot;
+
+	indexInfo = BuildIndexInfo(index);
+	indnkeyatts = IndexRelationGetNumberOfKeyAttributes(index);
+
+	/*
+	 * Search the tuples that are in the index for any violations, including
+	 * tuples that aren't visible yet.
+	 */
+	InitDirtySnapshot(DirtySnapshot);
+
+	for (int i = 0; i < indnkeyatts; i++)
+	{
+		if (indexInfo->ii_IndexAttrNumbers[i] == colnum)
+		{
+			sk_attno = i+1;
+			break;
+		}
+	}
+
+	if (sk_attno == -1)
+		elog(ERROR, "index %u should effect on column number %d", indexoid, colnum);
+
+	for (int i = 0; i < indnkeyatts; i++)
+	{
+		/* set up an IS NULL scan key so that we ignore not nulls */
+		ScanKeyEntryInitialize(&scankeys[i],
+								SK_ISNULL | SK_SEARCHNULL,
+								sk_attno,		/* index col to scan */
+								InvalidStrategy,/* no strategy */
+								InvalidOid,		/* no strategy subtype */
+								InvalidOid,		/* no collation */
+								InvalidOid,		/* no reg proc for this */
+								(Datum) 0);		/* constant */
+	}
+
+	index_scan = index_beginscan(heap, index, &DirtySnapshot, indnkeyatts, 0);
+	index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0);
+	while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot))
+	{
+		Datum		values[INDEX_MAX_KEYS];
+		bool		nulls[INDEX_MAX_KEYS];
+
+		/*
+		 * Extract the index column values and isnull flags from the existing
+		 * tuple.
+		 */
+		FormIndexDatum(indexInfo, existing_slot, estate, values, nulls);
+
+		if (nulls[sk_attno - 1])
+		{
+			all_not_null = false;
+			break;
+		}
+	}
+	index_endscan(index_scan);
+	table_close(heap, NoLock);
+	index_close(index, AccessShareLock);
+	ExecDropSingleTupleTableSlot(existing_slot);
+	FreeExecutorState(estate);
+
+	return all_not_null;
+}
\ No newline at end of file
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index cad830dc39..bbb75820c5 100644
--- a/src/include/catalog/heap.h
+++ b/src/include/catalog/heap.h
@@ -46,6 +46,8 @@ typedef struct CookedConstraint
 	int16		inhcount;		/* number of times constraint is inherited */
 	bool		is_no_inherit;	/* constraint has local def and cannot be
 								 * inherited */
+	bool		validated;  	/* is this constraint already validated?
+								 * only for CONSTR_NOTNULL in AddRelationNewConstraints */
 } CookedConstraint;
 
 extern Relation heap_create(const char *relname,
@@ -114,6 +116,7 @@ extern List *AddRelationNewConstraints(Relation rel,
 									   bool allow_merge,
 									   bool is_local,
 									   bool is_internal,
+									   bool index_rebuild,
 									   const char *queryString);
 extern List *AddRelationNotNullConstraints(Relation rel,
 										   List *constraints,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 45b80e6b98..d4f143c5aa 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -661,6 +661,7 @@ extern void check_exclusion_constraint(Relation heap, Relation index,
 									   const Datum *values, const bool *isnull,
 									   EState *estate, bool newIndex);
 
+extern bool index_check_notnull(Oid heap, Oid indexoid, AttrNumber colnum);
 /*
  * prototypes from functions in execReplication.c
  */
diff --git a/src/test/isolation/expected/fast-set-notnull.out b/src/test/isolation/expected/fast-set-notnull.out
new file mode 100644
index 0000000000..36ac14a2e4
--- /dev/null
+++ b/src/test/isolation/expected/fast-set-notnull.out
@@ -0,0 +1,97 @@
+Parsed test spec with 2 sessions
+
+starting permutation: b1 b3 m1 hj c1 c3 s2
+step b1: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step b3: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step m1: DELETE FROM t;
+step hj: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a; <waiting ...>
+step c1: COMMIT;
+step hj: <... completed>
+step c3: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b2 b3 m1 hj c1 c3 s2
+step b2: BEGIN;
+step b3: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step m1: DELETE FROM t;
+step hj: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a; <waiting ...>
+step c1: COMMIT;
+step hj: <... completed>
+step c3: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b1 b4 m1 hj c1 c3 s2
+step b1: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step b4: BEGIN;
+step m1: DELETE FROM t;
+step hj: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a; <waiting ...>
+step c1: COMMIT;
+step hj: <... completed>
+step c3: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+
+starting permutation: b1 b3 hj r1 c2
+step b1: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step b3: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step hj: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+ERROR:  column "a" of relation "t" contains null values
+step r1: ROLLBACK;
+step c2: ROLLBACK;
+
+starting permutation: b1 b4 d1 m1 c2 s1 s2 r1
+step b1: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step b4: BEGIN;
+step d1: DROP INDEX t_ab_idx;
+step m1: DELETE FROM t; <waiting ...>
+step c2: ROLLBACK;
+step m1: <... completed>
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
+step r1: ROLLBACK;
+
+starting permutation: b2 b3 m1 d1 s0 s1 c1 c3 s2
+step b2: BEGIN;
+step b3: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step m1: DELETE FROM t;
+step d1: DROP INDEX t_ab_idx; <waiting ...>
+step s0: savepoint s0;
+step s1: ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;
+step c1: COMMIT;
+step d1: <... completed>
+step c3: COMMIT;
+step s2: SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+
+conname|conrelid|contype|convalidated
+-------+--------+-------+------------
+t1_nn  |t       |n      |t           
+(1 row)
+
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 143109aa4d..32d7e51aa9 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -55,6 +55,7 @@ test: merge-delete
 test: merge-update
 test: merge-match-recheck
 test: merge-join
+test: fast-set-notnull
 test: delete-abort-savept
 test: delete-abort-savept-2
 test: aborted-keyrevoke
diff --git a/src/test/isolation/specs/fast-set-notnull.spec b/src/test/isolation/specs/fast-set-notnull.spec
new file mode 100644
index 0000000000..a7e71b31e8
--- /dev/null
+++ b/src/test/isolation/specs/fast-set-notnull.spec
@@ -0,0 +1,43 @@
+#
+# fast not-null tests
+#
+
+setup
+{
+  CREATE TABLE t (a int, b int);
+  CREATE INDEX t_ab_idx on t(a,b);
+  INSERT INTO t values (null, 1);
+  INSERT INTO t SELECT x, x*10 FROM generate_series(1,3) g(x);
+}
+
+teardown
+{
+  DROP TABLE t;
+}
+
+session s1
+step b1  { BEGIN ISOLATION LEVEL REPEATABLE READ; }
+step b2  { BEGIN; }
+step m1  { DELETE FROM t;}
+step s0  { savepoint s0;}
+step s1  { ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;}
+step s2  { SELECT conname, conrelid::regclass, contype, convalidated
+           FROM pg_constraint WHERE conname = 't1_nn';
+         }
+step r1  { ROLLBACK; }
+step c1  { COMMIT; }
+
+session s2
+step b3  { BEGIN ISOLATION LEVEL REPEATABLE READ; }
+step b4  { BEGIN; }
+step hj  { ALTER TABLE t ADD CONSTRAINT t1_nn NOT NULL a;}
+step d1  { DROP INDEX t_ab_idx;}
+step c2  { ROLLBACK; }
+step c3  { COMMIT; }
+
+permutation b1 b3 m1 hj c1 c3 s2
+permutation b2 b3 m1 hj c1 c3 s2
+permutation b1 b4 m1 hj c1 c3 s2
+permutation b1 b3 hj r1 c2
+permutation b1 b4 d1 m1 c2 s1 s2 r1
+permutation b2 b3 m1 d1 s0 s1 c1 c3 s2
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 362f38856d..38e247a7d4 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -4762,3 +4762,44 @@ drop publication pub1;
 drop schema alter1 cascade;
 drop schema alter2 cascade;
 NOTICE:  drop cascades to table alter2.t1
+-----fast add not-null constraint tests.
+CREATE TABLE tp (a int, b int GENERATED ALWAYS AS (22) stored) PARTITION BY range(a);
+CREATE TABLE tpp1(a int, b int GENERATED ALWAYS AS (a+1) stored);
+CREATE TABLE tpp2(a int, b int GENERATED ALWAYS AS (a+10) stored);
+ALTER TABLE tp ATTACH PARTITION tpp1 FOR values from ( 1 ) to (100);
+ALTER TABLE tp ATTACH PARTITION tpp2 default;
+CREATE INDEX ON tp(b);
+insert into tp select g from generate_series(100,120) g;
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE tp alter column b set not null, ALTER COLUMN b SET EXPRESSION AS (a * 3);
+ALTER TABLE tp alter column b drop not null;
+ALTER TABLE tp add constraint nna not null a;
+------test non-partitioned table
+CREATE TABLE t1(f1 INT, f2 int, f3 int,f4 int, f5 int);
+INSERT INTO t1 select g, g+1, g+2, g+3, g+4 from generate_series(1, 20) g;
+CREATE INDEX t1_f1_f2_idx ON t1(f1,f2);
+CREATE UNIQUE INDEX t1_f3idx ON t1(f3);
+CREATE INDEX t1_f3f4idx ON t1(f3) include(f4);
+CREATE INDEX hash_f5_idx ON t1 USING hash (f5 int4_ops);
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE t1 alter column f1 set not null, ALTER f1 TYPE BIGINT;
+ALTER TABLE t1 alter column f1 drop not null;
+ALTER TABLE t1 add constraint nn not null f1, ALTER f1 TYPE bigint;
+--cannot fast ALTER TABLE SET NOT NULL by utilizing hash index.
+ALTER TABLE t1 add constraint nn_f5 not null f5;
+--ok. with ALTER COLUMN SET NOT NULL variant
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 alter column f2 set not null, alter column f1 set not null;
+--ok. with add constraint variants
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 add constraint nn not null f2, add constraint nnf1 not null f1;
+--ok.
+ALTER TABLE t1 add constraint t_f3_pk primary key using index t1_f3idx;
+NOTICE:  ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index "t1_f3idx" to "t_f3_pk"
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--for column f4 now, can only for key columns, not include column.
+ALTER TABLE t1 add constraint nnf4 not null f4;
+drop table t1;
+drop table tpp1, tpp2, tp;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 84e93ef575..cd0584b6c6 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -3109,3 +3109,50 @@ alter table alter1.t1 set schema alter2;
 drop publication pub1;
 drop schema alter1 cascade;
 drop schema alter2 cascade;
+
+-----fast add not-null constraint tests.
+CREATE TABLE tp (a int, b int GENERATED ALWAYS AS (22) stored) PARTITION BY range(a);
+CREATE TABLE tpp1(a int, b int GENERATED ALWAYS AS (a+1) stored);
+CREATE TABLE tpp2(a int, b int GENERATED ALWAYS AS (a+10) stored);
+ALTER TABLE tp ATTACH PARTITION tpp1 FOR values from ( 1 ) to (100);
+ALTER TABLE tp ATTACH PARTITION tpp2 default;
+CREATE INDEX ON tp(b);
+insert into tp select g from generate_series(100,120) g;
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE tp alter column b set not null, ALTER COLUMN b SET EXPRESSION AS (a * 3);
+ALTER TABLE tp alter column b drop not null;
+ALTER TABLE tp add constraint nna not null a;
+
+------test non-partitioned table
+CREATE TABLE t1(f1 INT, f2 int, f3 int,f4 int, f5 int);
+INSERT INTO t1 select g, g+1, g+2, g+3, g+4 from generate_series(1, 20) g;
+CREATE INDEX t1_f1_f2_idx ON t1(f1,f2);
+CREATE UNIQUE INDEX t1_f3idx ON t1(f3);
+CREATE INDEX t1_f3f4idx ON t1(f3) include(f4);
+CREATE INDEX hash_f5_idx ON t1 USING hash (f5 int4_ops);
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--so won't interfere with the index rebuild.
+ALTER TABLE t1 alter column f1 set not null, ALTER f1 TYPE BIGINT;
+ALTER TABLE t1 alter column f1 drop not null;
+ALTER TABLE t1 add constraint nn not null f1, ALTER f1 TYPE bigint;
+
+--cannot fast ALTER TABLE SET NOT NULL by utilizing hash index.
+ALTER TABLE t1 add constraint nn_f5 not null f5;
+
+--ok. with ALTER COLUMN SET NOT NULL variant
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 alter column f2 set not null, alter column f1 set not null;
+--ok. with add constraint variants
+ALTER TABLE t1 alter column f1 drop not null, alter column f2 drop not null;
+ALTER TABLE t1 add constraint nn not null f2, add constraint nnf1 not null f1;
+--ok.
+ALTER TABLE t1 add constraint t_f3_pk primary key using index t1_f3idx;
+
+--cannot fast ALTER TABLE SET NOT NULL or ALTER TABLE ADD CONSTRAINT NOT NULL
+--for column f4 now, can only for key columns, not include column.
+ALTER TABLE t1 add constraint nnf4 not null f4;
+drop table t1;
+drop table tpp1, tpp2, tp;
-- 
2.34.1



^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2025-02-05 08:24 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-12-16 07:07 using index to speedup add not null constraints to a table jian he <[email protected]>
2024-12-31 16:09 ` jian he <[email protected]>
2025-02-05 08:24   ` jian he <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox