pg_publication_tables: return NULL attnames when no column list is specified

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Roberto Mello <[email protected]>
To: PostgreSQL Developers <[email protected]>
Subject: pg_publication_tables: return NULL attnames when no column list is specified
Date: Wed, 25 Mar 2026 13:50:45 -0600
Message-ID: <CAKz==b+k7OwOib__LgfDadUzW9jFrbQ9FqdERpsufi8d7F=vkQ@mail.gmail.com> (raw)

Hi,

When a table is added to a publication without a column list:

ALTER PUBLICATION my_pub ADD TABLE my_table;

Any new column added to my_table will automatically be replicated. The
underlying catalog (pg_publication_rel.prattrs) stores NULL for this case.

When a table is added with an explicit column list:

ALTER PUBLICATION my_pub ADD TABLE my_table (id, name, status);

New columns will NOT be replicated until the publication is explicitly
altered.
The catalog stores an int2vector of the specified attribute numbers.

The problem is that pg_get_publication_tables() in pg_publication.c (the
set-returning
function backing the pg_publication_tables view) erases this distinction.
When prattrs is NULL, it opens the table, iterates all eligible attributes,
and builds a synthetic int2vector of all current columns. The view then
shows
the same attnames output for both cases.

| Scenario                 | prattrs in catalog | attrs from SRF      |
attnames in view |
|--------------------------|--------------------|---------------------|------------------|
| column list (a, b)       | {1, 2  }           | {1,2}               | {a,
b}           |
| No column list           | NULL               | {1,2,3,...} (synth) | {a,
b, c, ...}   |
| FOR ALL TABLES/IN SCHEMA | no catalog row     | {1,2,3,...} (synth) | {a,
b, c, ...}   |

This is a problem for operations: schema migrations that add columns to
published
tables may or may not replicate depending on how the publication was
originally
defined, and the only way to check is querying pg_publication_rel directly.

Additionally, tablesync.c has a workaround heuristic that tries to reverse
this synthesis by comparing array_length(gpt.attrs, 1) against c.relnatts.
This heuristic is buggy: relnatts counts all user attributes including
dropped
columns, but the synthesized list excludes dropped columns and conditionally
excludes generated columns. So for any table with a dropped column, the
heuristic incorrectly treats "no column list" as if an explicit list were
specified.

The problematic code is the result of commits that incrementally
built on each other, with later ones introducing workarounds for side
effects
of earlier ones:

1. fd0b9dcebda (06/2022) Amit Kapila (author: Hou Zhijie)
Prohibit combining publications with different column lists

Added the subscriber-side check that prevents combining publications with
different column lists for the same table. The subscriber runs
SELECT DISTINCT gpt.attrs and expects at most one row. To handle the fact
that pg_get_publication_tables() returned NULL for no-column-list
publications (which would always be distinct from any explicit list), this
commit introduced the relnatts heuristic in tablesync.c:

CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
     THEN NULL ELSE gpt.attrs END

The idea was: if the SRF returns a synthesized list whose length equals
relnatts, it must be the "all columns" case, so collapse it back to NULL.
But at this point the SRF didn't synthesize anything yet, so the heuristic
was anticipating the next commit.

2. b7ae0395369 (01/2023) Amit Kapila (author: Shi yu)
Ignore dropped and generated columns from the column list

This is the commit that introduced the synthesis loop. It "solved" a problem
where a table in two publications (one with a column list naming all
columns,
one without a column list) was erroring with "cannot use different column
lists"
because one returned NULL and the other returned an int2vector. The fix was
to synthesize a column list when none was specified, filtering out dropped
and generated columns, so that both publications would produce identical
int2vectors and DISTINCT would collapse them to one row.

But this synthesis made it impossible to distinguish "all columns" from
"explicit list of all columns" in the view. And the relnatts heuristic from
fd0b9dcebda - which was supposed to reverse the synthesis - was broken from
the start because relnatts includes dropped columns while the synthesized
list
excludes them.

The synthesis in b7ae0395369 tried to make "no column list" and "explicit
list
of all columns" look identical. But they have genuinely different
semantics:

- No column list (NULL): all current and future columns are replicated.
  ALTER TABLE ADD COLUMN automatically replicates the new column.
- Explicit full list: only the named columns are replicated. New columns
  are NOT replicated until the publication is explicitly altered.

By making them indistinguishable, the synthesis hid a real conflict from
users
who had a table in two publications with different column semantics on the
same subscription. I am proposing a fix that restores the distinction and
correctly
(IMO) surfaces this conflict.

The fix is to stop synthesizing the full column list in
pg_get_publication_tables().
When prattrs is NULL in the catalog, let attrs remain NULL in the SRF
output.
Remove the buggy CASE WHEN heuristic in tablesync.c since it is no longer
needed.

There is one scenario where there is a change for users: one pub no list +
one pub explicit all columns. Anyone with that specific configuration will
see a new error on the
next tablesync or subscription refresh after upgrading. The fix for those
users is to
either remove the explicit column list from the second publication (making
both "all
columns"), or keep the difference and use separate subscriptions.

AFAICT this is safe except for the change in behavior I describe above.
psql and pg_dump query pg_publication_rel directly and not the affected
view. New subscriber
talking to old publisher still works (old pub synthesizes a list, new sub
handles it).
Old subscriber talking to new publisher also works (the old CASE WHEN
evaluates
array_length(NULL, 1) which returns NULL, falling through to the ELSE branch
returning NULL).

Attached is a patch with the fix, including updates to documentation, 7
updated
regression tests where existing expected outputs where attnames changes from
{a} to NULL, and added new test cases that verify both cases are
distinguishable in
the same query.

Roberto Mello
Snowflake

Attachments:

  [application/octet-stream] v1-0001-Fix-pg_publication_tables-to-return-NULL-attnames.patch (12.6K, 3-v1-0001-Fix-pg_publication_tables-to-return-NULL-attnames.patch)
  download | inline diff:
From 7f2581395567b76efd85975d2133ee2c484f18ff Mon Sep 17 00:00:00 2001
From: Roberto Mello <[email protected]>
Date: Tue, 24 Mar 2026 19:20:10 -0600
Subject: [PATCH v1] Fix pg_publication_tables to return NULL attnames for
 all-column publications

Previously pg_get_publication_tables() synthesized a column list even
when no explicit column list was specified in the publication. This made
it impossible for consumers of pg_publication_tables to distinguish
between "all columns are published" (new columns will automatically be
replicated) and an explicit column list (only listed columns are
replicated).

Worse, the subscriber-side workaround in fetch_remote_table_info()
compared array_length(gpt.attrs, 1) against pg_class.relnatts to detect
the "all columns" case, but relnatts includes dropped columns while the
synthesized list excluded them, so tables with dropped columns were
misidentified as having an explicit column list.

Fix by simply leaving attrs as NULL when no column list was specified
(prattrs is NULL) in the publication catalog. This is consistent with
how prattrs itself is stored and removes the need for the relnatts
heuristic in tablesync.c.

The real replication column filtering (which must account for dropped
columns, generated columns, and pub->pubgencols_type) is performed
downstream in pgoutput.c by pub_form_cols_map() and
check_and_fetch_column_list(), both of which are unchanged by this
patch.

---
 doc/src/sgml/system-views.sgml              |  9 ++--
 src/backend/catalog/pg_publication.c        | 52 ++++---------------
 src/backend/replication/logical/tablesync.c |  9 ++--
 src/test/regress/expected/publication.out   | 57 ++++++++++++++++++---
 src/test/regress/sql/publication.sql        | 34 ++++++++++++
 5 files changed, 102 insertions(+), 59 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..6c926860e41 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2712,9 +2712,12 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        (references <link linkend="catalog-pg-attribute"><structname>pg_attribute</structname></link>.<structfield>attname</structfield>)
       </para>
       <para>
-       Names of table columns included in the publication. This contains all
-       the columns of the table when the user didn't specify the column list
-       for the table.
+       Names of the table columns included in the publication, or NULL if
+       no explicit column list was specified for the table.  When NULL, all
+       current and future columns of the table are published; new columns
+       added to the table will automatically be replicated.  When non-NULL,
+       only the listed columns are replicated, and newly added columns will
+       not appear until the publication is explicitly altered.
       </para></entry>
      </row>

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index c92ff3f51c3..a2fa4906f08 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1439,49 +1439,15 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 			nulls[3] = true;
 		}

-		/* Show all columns when the column list is not specified. */
-		if (nulls[2])
-		{
-			Relation	rel = table_open(relid, AccessShareLock);
-			int			nattnums = 0;
-			int16	   *attnums;
-			TupleDesc	desc = RelationGetDescr(rel);
-			int			i;
-
-			attnums = palloc_array(int16, desc->natts);
-
-			for (i = 0; i < desc->natts; i++)
-			{
-				Form_pg_attribute att = TupleDescAttr(desc, i);
-
-				if (att->attisdropped)
-					continue;
-
-				if (att->attgenerated)
-				{
-					/* We only support replication of STORED generated cols. */
-					if (att->attgenerated != ATTRIBUTE_GENERATED_STORED)
-						continue;
-
-					/*
-					 * User hasn't requested to replicate STORED generated
-					 * cols.
-					 */
-					if (pub->pubgencols_type != PUBLISH_GENCOLS_STORED)
-						continue;
-				}
-
-				attnums[nattnums++] = att->attnum;
-			}
-
-			if (nattnums > 0)
-			{
-				values[2] = PointerGetDatum(buildint2vector(attnums, nattnums));
-				nulls[2] = false;
-			}
-
-			table_close(rel, AccessShareLock);
-		}
+		/*
+		 * When no column list is specified (prattrs is NULL), we leave
+		 * attrs as NULL rather than synthesizing a list of all current
+		 * columns.  This allows consumers of pg_publication_tables to
+		 * distinguish between "all columns are published" (attrs IS
+		 * NULL - new columns will automatically be replicated) and an
+		 * explicit column list (attrs IS NOT NULL - only listed columns
+		 * are replicated).
+		 */

 		rettuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f49a4852ecb..5f6d892e595 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -799,13 +799,10 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
 		 */
 		resetStringInfo(&cmd);
 		appendStringInfo(&cmd,
-						 "SELECT DISTINCT"
-						 "  (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
-						 "   THEN NULL ELSE gpt.attrs END)"
+						 "SELECT DISTINCT gpt.attrs"
 						 "  FROM pg_publication p,"
-						 "  LATERAL pg_get_publication_tables(p.pubname) gpt,"
-						 "  pg_class c"
-						 " WHERE gpt.relid = %u AND c.oid = gpt.relid"
+						 "  LATERAL pg_get_publication_tables(p.pubname) gpt"
+						 " WHERE gpt.relid = %u"
 						 "   AND p.pubname IN ( %s )",
 						 lrel->remoteid,
 						 pub_names->data);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index a220f48b285..011c6026816 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -2072,7 +2072,7 @@ CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch2 WITH (PUBLISH_VIA_PARTITION_ROO
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter
 ---------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      |
+ pub     | sch2       | tbl1_part1 |          |
 (1 row)

 DROP PUBLICATION pub;
@@ -2081,7 +2081,7 @@ CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH (PUBLISH_VIA_PARTITION_ROO
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter
 ---------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      |
+ pub     | sch2       | tbl1_part1 |          |
 (1 row)

 -- Table publication that includes both the parent table and the child table
@@ -2089,7 +2089,7 @@ ALTER PUBLICATION pub ADD TABLE sch1.tbl1;
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename | attnames | rowfilter
 ---------+------------+-----------+----------+-----------
- pub     | sch1       | tbl1      | {a}      |
+ pub     | sch1       | tbl1      |          |
 (1 row)

 DROP PUBLICATION pub;
@@ -2098,7 +2098,7 @@ CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch2 WITH (PUBLISH_VIA_PARTITION_ROO
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter
 ---------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      |
+ pub     | sch2       | tbl1_part1 |          |
 (1 row)

 DROP PUBLICATION pub;
@@ -2107,7 +2107,7 @@ CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH (PUBLISH_VIA_PARTITION_ROO
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter
 ---------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      |
+ pub     | sch2       | tbl1_part1 |          |
 (1 row)

 -- Table publication that includes both the parent table and the child table
@@ -2115,7 +2115,7 @@ ALTER PUBLICATION pub ADD TABLE sch1.tbl1;
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter
 ---------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      |
+ pub     | sch2       | tbl1_part1 |          |
 (1 row)

 DROP PUBLICATION pub;
@@ -2130,7 +2130,7 @@ CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch1 WITH (PUBLISH_VIA_PARTITION_ROO
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename | attnames | rowfilter
 ---------+------------+-----------+----------+-----------
- pub     | sch1       | tbl1      | {a}      |
+ pub     | sch1       | tbl1      |          |
 (1 row)

 RESET client_min_messages;
@@ -2139,6 +2139,49 @@ DROP TABLE sch1.tbl1;
 DROP SCHEMA sch1 cascade;
 DROP SCHEMA sch2 cascade;
 -- ======================================================
+-- Test that pg_publication_tables distinguishes between tables with
+-- an explicit column list and tables without one (attnames should be
+-- NULL when no column list was specified).
+CREATE TABLE pub_col_test (id int, name text, status text);
+CREATE PUBLICATION pub_nocols FOR TABLE pub_col_test;
+CREATE PUBLICATION pub_cols FOR TABLE pub_col_test (id, name);
+-- Without column list: attnames should be NULL
+SELECT pubname, attnames IS NULL AS all_columns FROM pg_publication_tables
+  WHERE tablename = 'pub_col_test' ORDER BY pubname;
+  pubname   | all_columns
+------------+-------------
+ pub_cols   | f
+ pub_nocols | t
+(2 rows)
+
+-- With column list: attnames should list specific columns
+SELECT pubname, attnames FROM pg_publication_tables
+  WHERE tablename = 'pub_col_test' AND attnames IS NOT NULL ORDER BY pubname;
+ pubname  | attnames
+----------+-----------
+ pub_cols | {id,name}
+(1 row)
+
+DROP PUBLICATION pub_nocols;
+DROP PUBLICATION pub_cols;
+DROP TABLE pub_col_test;
+-- Test that a table with a dropped column still shows attnames as NULL
+-- when no explicit column list was specified.  The old implementation
+-- compared the synthesized column count against relnatts, but relnatts
+-- includes dropped columns, so the heuristic was wrong for this case.
+CREATE TABLE pub_dropped_test (id int, dropped_col text, name text);
+ALTER TABLE pub_dropped_test DROP COLUMN dropped_col;
+CREATE PUBLICATION pub_dropped FOR TABLE pub_dropped_test;
+SELECT pubname, attnames IS NULL AS all_columns FROM pg_publication_tables
+  WHERE tablename = 'pub_dropped_test';
+   pubname   | all_columns
+-------------+-------------
+ pub_dropped | t
+(1 row)
+
+DROP PUBLICATION pub_dropped;
+DROP TABLE pub_dropped_test;
+-- ======================================================
 -- Test the 'publish_generated_columns' parameter with the following values:
 -- 'stored', 'none'.
 SET client_min_messages = 'ERROR';
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 22e0a30b5c7..0f3c443a180 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -1327,6 +1327,40 @@ DROP SCHEMA sch1 cascade;
 DROP SCHEMA sch2 cascade;
 -- ======================================================

+-- Test that pg_publication_tables distinguishes between tables with
+-- an explicit column list and tables without one (attnames should be
+-- NULL when no column list was specified).
+CREATE TABLE pub_col_test (id int, name text, status text);
+CREATE PUBLICATION pub_nocols FOR TABLE pub_col_test;
+CREATE PUBLICATION pub_cols FOR TABLE pub_col_test (id, name);
+
+-- Without column list: attnames should be NULL
+SELECT pubname, attnames IS NULL AS all_columns FROM pg_publication_tables
+  WHERE tablename = 'pub_col_test' ORDER BY pubname;
+
+-- With column list: attnames should list specific columns
+SELECT pubname, attnames FROM pg_publication_tables
+  WHERE tablename = 'pub_col_test' AND attnames IS NOT NULL ORDER BY pubname;
+
+DROP PUBLICATION pub_nocols;
+DROP PUBLICATION pub_cols;
+DROP TABLE pub_col_test;
+
+-- Test that a table with a dropped column still shows attnames as NULL
+-- when no explicit column list was specified.  The old implementation
+-- compared the synthesized column count against relnatts, but relnatts
+-- includes dropped columns, so the heuristic was wrong for this case.
+CREATE TABLE pub_dropped_test (id int, dropped_col text, name text);
+ALTER TABLE pub_dropped_test DROP COLUMN dropped_col;
+CREATE PUBLICATION pub_dropped FOR TABLE pub_dropped_test;
+
+SELECT pubname, attnames IS NULL AS all_columns FROM pg_publication_tables
+  WHERE tablename = 'pub_dropped_test';
+
+DROP PUBLICATION pub_dropped;
+DROP TABLE pub_dropped_test;
+-- ======================================================
+
 -- Test the 'publish_generated_columns' parameter with the following values:
 -- 'stored', 'none'.
 SET client_min_messages = 'ERROR';

base-commit: 7baa39c468fa1d703d50cd697e9dd05f6831ae38
--
2.50.1 (Apple Git-155)

view thread (12+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: pg_publication_tables: return NULL attnames when no column list is specified
  In-Reply-To: <CAKz==b+k7OwOib__LgfDadUzW9jFrbQ9FqdERpsufi8d7F=vkQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox