public inbox for [email protected]  
help / color / mirror / Atom feed
From: Mohamed ALi <[email protected]>
To: [email protected]
Subject: [PATCH] Fix: Partitioned parent index remains invalid after child indexes are repaired
Date: Wed, 8 Apr 2026 23:16:48 -0700
Message-ID: <CAGnOmWqi1D9ycBgUeOGf6mOCd2Dcf=6sKhbf4sHLs5xAcKVCMQ@mail.gmail.com> (raw)

Hi hackers,

A partitioned (parent) index in PostgreSQL can become permanently
stuck with `indisvalid = false` even after all of its child partition
indexes have been repaired and are valid. There is no built-in
mechanism to re-validate the parent index after a child is fixed via
`REINDEX`. This affects all currently supported PostgreSQL versions
(13 through 18)
The root cause is that `validatePartitionedIndex()` — the only
function that can mark a partitioned index as valid is never called
after `REINDEX` operations, and is skipped when re-running `ALTER
INDEX ATTACH PARTITION` on an already-attached index.

How the Bug Manifests

Typical Scenario :
1. A partitioned table has multiple partitions.
2. The user creates indexes on partitions concurrently. One fails (due
to deadlock, cancellation, timeout, etc.), leaving an invalid
partition index.
3. A parent index is created (or the invalid index is attached to an
existing parent). The parent is correctly marked `indisvalid = false`
because at least one child is invalid.
4. The user fixes the broken child index with `REINDEX INDEX CONCURRENTLY`.
5. The child index becomes valid (`indisvalid = true`).
6. The parent index remains `indisvalid = false` permanently. No SQL
command can fix it.

Reproduction steps:

```sql
-- ============================================================
-- SETUP: Partitioned table with two partitions and sample data
-- ============================================================
DROP TABLE IF EXISTS orders;
CREATE TABLE orders (
    id serial,
    order_date date NOT NULL,
    amount numeric
) PARTITION BY RANGE (order_date);
CREATE TABLE orders_2023 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
INSERT INTO orders (order_date, amount)
SELECT d, random() * 1000
FROM generate_series('2023-01-01'::date, '2023-12-31'::date, '1 day') d;
INSERT INTO orders (order_date, amount)
SELECT d, random() * 1000
FROM generate_series('2024-01-01'::date, '2024-12-31'::date, '1 day') d;
-- ============================================================
-- STEP 1: Create parent index with ONLY (starts as invalid)
-- ============================================================
CREATE INDEX orders_amount_idx ON ONLY orders (amount);
-- Verify: parent index is invalid (no children attached yet)
SELECT c.relname, i.indisvalid
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relname LIKE 'orders%idx%'
ORDER BY c.relname;
-- Expected:
--  orders_amount_idx | f
-- ============================================================
-- STEP 2: Create valid index on first partition
-- ============================================================
CREATE INDEX CONCURRENTLY orders_2023_amount_idx ON orders_2023 (amount);
-- ============================================================
-- STEP 3: Create an INVALID index on second partition
-- ============================================================
-- In a separate session, hold a lock:
BEGIN; LOCK TABLE orders_2024 IN SHARE MODE;
-- Then in the main session:
SET statement_timeout = '1ms';
CREATE INDEX CONCURRENTLY orders_2024_amount_idx ON orders_2024 (amount);
RESET statement_timeout;
-- it will fail/timeout, leaving an invalid index.
-- Verify state:
SELECT c.relname, i.indisvalid
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relname LIKE 'orders%idx%'
ORDER BY c.relname;
-- Expected:
--  orders_2023_amount_idx | t   (valid)
--  orders_2024_amount_idx | f   (invalid)
--  orders_amount_idx      | f   (invalid, created with ONLY)
-- ============================================================
-- STEP 4: Attach both partition indexes to the parent
-- ============================================================
-- Attach the invalid one first
ALTER INDEX orders_amount_idx ATTACH PARTITION orders_2024_amount_idx;
-- Succeeds. Parent stays invalid (correct — child is invalid).
-- Attach the valid one
ALTER INDEX orders_amount_idx ATTACH PARTITION orders_2023_amount_idx;
-- Succeeds. Parent still invalid (correct — one child still invalid).
-- Verify attachment and validity:
SELECT c.relname, i.indisvalid,
       pg_get_indexdef(i.indexrelid) AS indexdef
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relname LIKE 'orders%amount%'
ORDER BY c.relname;
-- Expected:
--  orders_2023_amount_idx | t
--  orders_2024_amount_idx | f
--  orders_amount_idx      | f
-- ============================================================
-- STEP 5: Fix the invalid child index via REINDEX
-- ============================================================
REINDEX INDEX CONCURRENTLY orders_2024_amount_idx;
-- Verify: child is now valid
SELECT c.relname, i.indisvalid
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relname LIKE 'orders%amount%'
ORDER BY c.relname;
-- ACTUAL (buggy) result:
--  orders_2023_amount_idx | t   (valid)
--  orders_2024_amount_idx | t   (valid — fixed by REINDEX)
--  orders_amount_idx      | f   (STILL INVALID — this is the bug!)
--
-- EXPECTED result (if bug were fixed):
--  orders_2023_amount_idx | t
--  orders_2024_amount_idx | t
--  orders_amount_idx      | t   (should be valid now)
-- ============================================================
-- STEP 6: Demonstrate that re-running ATTACH does not help
-- ============================================================
ALTER INDEX orders_amount_idx ATTACH PARTITION orders_2024_amount_idx;
-- Returns "ALTER INDEX" (succeeds silently, does nothing)
SELECT c.relname, i.indisvalid
FROM pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
WHERE c.relname LIKE 'orders%amount%'
ORDER BY c.relname;
-- Parent is STILL invalid. The "silently do nothing" path
-- skips validatePartitionedIndex() entirely.
-- ============================================================
-- CLEANUP
-- ============================================================
DROP TABLE orders;
```


Root Cause Analysis:

Where `validatePartitionedIndex()` Is Called

The function is called in exactly these code paths:
1. During `ALTER INDEX ... ATTACH PARTITION` — inside
`ATExecAttachPartitionIdx()`
2. During `ALTER TABLE ... ATTACH PARTITION` — via
`AttachPartitionEnsureIndexes()`
3. During `CREATE INDEX` on partitioned tables — via `DefineIndex()`
It is NOT called:
- After `REINDEX` of a partitioned index
- During any maintenance operation
- As any periodic validation check

Bug 1: REINDEX Does Not Validate Parent


When `reindex_index()` in `src/backend/catalog/index.c` marks a
partition index as valid (setting `indisvalid = true`), it does not
check whether the parent partitioned index should also become valid.
The function simply updates the child's `pg_index` entry and returns.

Bug 2: Re-running ATTACH Skips Validation


In `ATExecAttachPartitionIdx()` (tablecmds.c, around line 21923 in PG
16 / line ~22900 in HEAD):
https://github.com/postgres/postgres/blob/master/src/backend/commands/tablecmds.c#L21923

```c
/* Silently do nothing if already in the right state */
currParent = partIdx->rd_rel->relispartition ?
    get_partition_parent(partIdxId, false) : InvalidOid;
if (currParent != RelationGetRelid(parentIdx))
{
    // ... all validation checks and attachment logic ...
    validatePartitionedIndex(parentIdx, parentTbl);  // ONLY called here
}
// If already attached, entire block is skipped — no validation!
```

When the child is already attached (`currParent == parentIdx`), the
condition is false, the entire if-block is skipped, and
`validatePartitionedIndex()` is never called. The comment "Silently do
nothing if already in the right state" is misleading  "already
attached" does not mean "parent validity is correct."

Proposed Fixes:

Fix 1 : Always Validate Parent Index in ALTER INDEX ATTACH PARTITION

Patch File : 0001-Always-validate-parent-index-in-ALTER-INDEX-ATTACH.patch

Move the validatePartitionedIndex() call outside the if-block so it runs
unconditionally — both when a new attachment is made and when the partition is
already attached. This provides a user-accessible recovery path: after fixing a
child index with REINDEX, re-running ALTER INDEX ATTACH PARTITION triggers
parent validation.

When the partition is already attached, a NOTICE is emitted:

NOTICE:  partition index "child_idx" is already attached to
"parent_idx", validating parent index


This follows PostgreSQL's existing convention of using NOTICE for
informational messages about no-op or reduced-scope operations (e.g.,
DROP TABLE IF EXISTS, CREATE INDEX IF NOT EXISTS). It tells the user:

1- Nothing went wrong
2- The index was already attached (so they know the state)
3-  Validation still happened (so they know the fix path works)


Fix 2: Validate Parent Partitioned Index After REINDEX of Child

Patch File : 0001-Validate-parent-partitioned-index-after-REINDEX.patch

Same underlying bug but this patch addresses it from the
REINDEX side. When a partition index is repaired via REINDEX or
REINDEX CONCURRENTLY, the parent partitioned index remains permanently
stuck with indisvalid = false even though all children are now valid.

This is because validatePartitionedIndex() — the only function that can
mark a partitioned index as valid is never called from any REINDEX code
path.


validatePartitionedIndex() is only called during:

1- ALTER INDEX ... ATTACH PARTITION (tablecmds.c)
2- ALTER TABLE ... ATTACH PARTITION (tablecmds.c)
3- CREATE INDEX on partitioned tables (indexcmds.c)

It is NOT called after:

1- REINDEX INDEX (regular) — handled by reindex_index() in index.c
2- REINDEX INDEX CONCURRENTLY — handled by ReindexRelationConcurrently()

in indexcmds.c, which uses index_concurrently_swap() in index.c

Three changes are made:

1. Make validatePartitionedIndex() public
The function was static in tablecmds.c. It is now exported via
tablecmds.h so it can be called from index.c and indexcmds.c.

Files changed:

src/backend/commands/tablecmds.c — remove static, update comment
src/include/commands/tablecmds.h — add extern declaration

2. Call from reindex_index() (regular REINDEX)
After reindex_index() marks a partition index as valid (indisvalid = true),
check if the index is a partition (iRel->rd_rel->relispartition) and if so,
look up the parent and call validatePartitionedIndex().

A CommandCounterIncrement() is required before the call so that the child's
updated indisvalid is visible to the syscache lookup that
validatePartitionedIndex() performs internally.

File changed: src/backend/catalog/index.c

3. Call from ReindexRelationConcurrently() (REINDEX CONCURRENTLY)
REINDEX CONCURRENTLY uses a completely different code path: it creates a new
index, builds it concurrently, then swaps it with the old one via
index_concurrently_swap(). The new index inherits the old index's partition
status during the swap.

After the swap and the existing CommandCounterIncrement() (which makes the
swap visible), check if the new index is a partition and call
validatePartitionedIndex() on the parent.

File changed: src/backend/commands/indexcmds.c

Multi-level Hierarchy Support
validatePartitionedIndex() already handles multi-level partition hierarchies.
When it marks a mid-level parent valid, it checks if that parent is itself a
partition and recursively validates the grandparent. No additional recursion
logic is needed in the REINDEX patches.


Thanks,
Mohamed Ali
Senior DBE
AWS RDS


Attachments:

  [application/octet-stream] 0001-Validate-parent-partitioned-index-after-REINDEX.patch (6.4K, 2-0001-Validate-parent-partitioned-index-after-REINDEX.patch)
  download | inline diff:
From 50da7cad2cd0f71c669b364d7eb682d71c143491 Mon Sep 17 00:00:00 2001
From: Mohamed Ali <[email protected]>
Date: Fri, 27 Mar 2026 19:38:32 -0700
Subject: [PATCH] fix: Validate parent partitioned index after REINDEX of child

After REINDEX (or REINDEX CONCURRENTLY) repairs an invalid partition
index, the parent partitioned index remains permanently stuck with
indisvalid=false because validatePartitionedIndex() is never called
from any REINDEX code path.

Fix this by:

1. Making validatePartitionedIndex() non-static and adding its
   declaration to tablecmds.h so it can be called from other modules.

2. Calling validatePartitionedIndex() from reindex_index() (for
   regular REINDEX) after marking a partition index valid.  A
   CommandCounterIncrement() is needed before the call so that the
   child's updated indisvalid is visible to the syscache lookup
   performed by validatePartitionedIndex().

3. Calling validatePartitionedIndex() from
   ReindexRelationConcurrently() (for REINDEX CONCURRENTLY) after
   the index swap and CommandCounterIncrement, at which point the
   new valid index has taken the old invalid index's place in the
   partition hierarchy.

validatePartitionedIndex() already handles multi-level partition
hierarchies by recursing upward when it marks a mid-level parent
valid, so this fix automatically cascades through grandparent and
higher-level partitioned indexes.

---
 src/backend/catalog/index.c      | 32 ++++++++++++++++++++++++++++++++
 src/backend/commands/indexcmds.c | 29 +++++++++++++++++++++++++++++
 src/backend/commands/tablecmds.c |  9 ++++++---
 src/include/commands/tablecmds.h |  2 ++
 4 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d8219b1..bf1a25c 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3885,6 +3885,38 @@ reindex_index(const ReindexStmt *stmt, Oid indexId,
 			CacheInvalidateRelcache(heapRelation);
 		}
 
+		/*
+		 * If this index is a partition of a partitioned index, and we just
+		 * marked it valid, check if the parent partitioned index can now be
+		 * marked valid too.  This handles the case where an invalid partition
+		 * index was attached to a partitioned index (making the parent
+		 * invalid), then later fixed via REINDEX.  validatePartitionedIndex
+		 * will recurse up the hierarchy if needed.
+		 */
+		if (index_bad && iRel->rd_rel->relispartition)
+		{
+			Oid			parentIdxId;
+
+			/* Make the child's indisvalid update visible for validation */
+			CommandCounterIncrement();
+
+			parentIdxId = get_partition_parent(indexId, false);
+			if (OidIsValid(parentIdxId))
+			{
+				Relation	parentIdx;
+				Relation	parentTbl;
+
+				parentIdx = index_open(parentIdxId, AccessShareLock);
+				parentTbl = table_open(parentIdx->rd_index->indrelid,
+									   AccessShareLock);
+
+				validatePartitionedIndex(parentIdx, parentTbl);
+
+				table_close(parentTbl, AccessShareLock);
+				index_close(parentIdx, AccessShareLock);
+			}
+		}
+
 		table_close(pg_index, RowExclusiveLock);
 	}
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index dd593cc..ab677d7 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -34,6 +34,7 @@
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_database.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/partition.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_tablespace.h"
@@ -4317,6 +4318,34 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
 		 * characters.
 		 */
 		CommandCounterIncrement();
+
+		/*
+		 * If the new (swapped-in) index is a partition of a partitioned
+		 * index, check if the parent can now be marked valid.  This handles
+		 * REINDEX CONCURRENTLY on a partition index that was previously
+		 * invalid: the new (valid) index replaces the old (invalid) one,
+		 * and the parent should be re-validated.
+		 */
+		if (get_rel_relispartition(newidx->indexId))
+		{
+			Oid			parentIdxId;
+
+			parentIdxId = get_partition_parent(newidx->indexId, true);
+			if (OidIsValid(parentIdxId))
+			{
+				Relation	parentIdx;
+				Relation	parentTbl;
+
+				parentIdx = index_open(parentIdxId, AccessShareLock);
+				parentTbl = table_open(parentIdx->rd_index->indrelid,
+									   AccessShareLock);
+
+				validatePartitionedIndex(parentIdx, parentTbl);
+
+				table_close(parentTbl, AccessShareLock);
+				index_close(parentIdx, AccessShareLock);
+			}
+		}
 	}
 
 	/* Commit this transaction and make index swaps visible */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c69c12d..8612067 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -746,7 +746,6 @@ static void DetachPartitionFinalize(Relation rel, Relation partRel,
 static ObjectAddress ATExecDetachPartitionFinalize(Relation rel, RangeVar *name);
 static ObjectAddress ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx,
 											  RangeVar *name);
-static void validatePartitionedIndex(Relation partedIdx, Relation partedTbl);
 static void refuseDupeIndexAttach(Relation parentIdx, Relation partIdx,
 								  Relation partitionTbl);
 static void verifyPartitionIndexNotNull(IndexInfo *iinfo, Relation partition);
@@ -22058,12 +22057,16 @@ refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTb
 }
 
 /*
+ * validatePartitionedIndex
+ *
  * Verify whether the set of attached partition indexes to a parent index on
  * a partitioned table is complete.  If it is, mark the parent index valid.
  *
- * This should be called each time a partition index is attached.
+ * This should be called each time a partition index is attached, and also
+ * after a partition index is repaired via REINDEX, so that the parent can
+ * be marked valid once all children are valid.
  */
-static void
+void
 validatePartitionedIndex(Relation partedIdx, Relation partedTbl)
 {
 	Relation	inheritsRel;
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index c3d8518..84983fb 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -108,4 +108,6 @@ extern void RangeVarCallbackOwnsRelation(const RangeVar *relation,
 extern bool PartConstraintImpliedByRelConstraint(Relation scanrel,
 												 List *partConstraint);
 
+extern void validatePartitionedIndex(Relation partedIdx, Relation partedTbl);
+
 #endif							/* TABLECMDS_H */
-- 
2.50.1 (Apple Git-155)



  [application/octet-stream] 0001-Always-validate-parent-index-in-ALTER-INDEX-ATTACH.patch (3.0K, 3-0001-Always-validate-parent-index-in-ALTER-INDEX-ATTACH.patch)
  download | inline diff:
From af4c2b7b438585240f134aa50ec92c0b838e4573 Mon Sep 17 00:00:00 2001
From: Mohamed Ali <[email protected]>
Date: Fri, 27 Mar 2026 19:56:28 -0700
Subject: [PATCH] fix: Always validate parent index in ALTER INDEX ATTACH
 PARTITION

When ALTER INDEX ... ATTACH PARTITION is executed on a partition index
that is already attached to the parent, the entire if-block is skipped,
including the call to validatePartitionedIndex(). This means that if a
previously invalid partition index has been repaired via REINDEX, there
is no way to trigger re-validation of the parent partitioned index.

Move the validatePartitionedIndex() call outside the if-block so it
runs unconditionally. This allows users to re-run ALTER INDEX ATTACH
PARTITION on an already-attached index to trigger parent validation
after a child has been fixed.

When the partition is already attached, emit a NOTICE informing the
user that the index is already attached and that only parent validation
is being performed. This follows the existing PostgreSQL convention of
using NOTICE for informational messages about no-op or reduced-scope
operations (e.g., DROP IF EXISTS, CREATE INDEX IF NOT EXISTS).

NOTICE:  partition index "child_idx" is already attached to "parent_idx", validating parent index

validatePartitionedIndex() is idempotent and cheap (it scans
pg_inherits and counts valid children), so calling it when not strictly
needed has negligible performance impact.

---
 src/backend/commands/tablecmds.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c69c12d..0b29180 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -21917,7 +21917,7 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
 
 	ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partIdx));
 
-	/* Silently do nothing if already in the right state */
+	/* Check if already attached to this parent */
 	currParent = partIdx->rd_rel->relispartition ?
 		get_partition_parent(partIdxId, false) : InvalidOid;
 	if (currParent != RelationGetRelid(parentIdx))
@@ -22023,9 +22023,22 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
 										  RelationGetRelid(partTbl));
 
 		free_attrmap(attmap);
-
-		validatePartitionedIndex(parentIdx, parentTbl);
 	}
+	else
+	{
+		ereport(NOTICE,
+				(errmsg("partition index \"%s\" is already attached to \"%s\", validating parent index",
+						RelationGetRelationName(partIdx),
+						RelationGetRelationName(parentIdx))));
+	}
+
+	/*
+	 * Always validate the parent partitioned index, even if the partition
+	 * was already attached.  This handles the case where a previously
+	 * invalid partition index has been repaired (e.g., via REINDEX) and
+	 * the parent can now be marked valid.
+	 */
+	validatePartitionedIndex(parentIdx, parentTbl);
 
 	relation_close(parentTbl, AccessShareLock);
 	/* keep these locks till commit */
-- 
2.50.1 (Apple Git-155)



view thread (12+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: [PATCH] Fix: Partitioned parent index remains invalid after child indexes are repaired
  In-Reply-To: <CAGnOmWqi1D9ycBgUeOGf6mOCd2Dcf=6sKhbf4sHLs5xAcKVCMQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox