Re: Initial COPY of Logical Replication is too slow

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Masahiko Sawada <[email protected]>
To: Zhijie Hou (Fujitsu) <[email protected]>
Cc: Amit Kapila <[email protected]>
Cc: Peter Smith <[email protected]>
Cc: Jan Wieck <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Initial COPY of Logical Replication is too slow
Date: Tue, 31 Mar 2026 12:40:14 -0700
Message-ID: <CAD21AoBB3ovY4Hpr+HM6VkCY+sj3_71ZRi3oOP67w5ivvy4zuQ@mail.gmail.com> (raw)
In-Reply-To: <TY4PR01MB169070A34D1C74867EF5A2DE49453A@TY4PR01MB16907.jpnprd01.prod.outlook.com>
References: <CAB-JLwbBFNuASyEnZWP0Tck9uNkthBZqi6WoXNevUT6+mV8XmA@mail.gmail.com>
	<CAD21AoA6i2ui8FMZeuU_KxX4t-fM8G==zTW2Dp6-goujttrpew@mail.gmail.com>
	<CAB-JLwZpp=7c9_r0beWWJxRh2BS_2Vvth8UDv7H57DBeaqggVg@mail.gmail.com>
	<CAD21AoDT3sL2COprsRumM9zEpL1Bk5VWboK4V2mRnjGua8xfeA@mail.gmail.com>
	<CAD21AoDQM62GOtaTzD_CVMSsFhv6o9c0Au1dSM1QuxeKFkWAKw@mail.gmail.com>
	<CAD21AoCz7HjEr3oeb=haK31YHxHZLcvD_wx_a-+xLPKywq++3A@mail.gmail.com>
	<TY4PR01MB16907733B75A99117F013AFCA947FA@TY4PR01MB16907.jpnprd01.prod.outlook.com>
	<CAD21AoA9YgiY1rVKMPZwB00WU_G4UfzoawY=7hyd7hpvBPcK6w@mail.gmail.com>
	<CAA4eK1KoSi60dtakJzn0MxNnHF1Yf4indSAffTjJxQG_31jsgQ@mail.gmail.com>
	<CAD21AoB4B3MOxJ7-v9YLjV5fTOtaLRUhX3jN3kqhEi7D7-uY4A@mail.gmail.com>
	<[email protected]>
	<CAD21AoCmHpKrNg9D3mcOA973CZ5N_dBLxb8pERpSxEeRLSQxpA@mail.gmail.com>
	<CAD21AoAEVyxwn_bMWHvcU-Gcz3aUVjAtMbdgfoJ8MZNiLLEh0g@mail.gmail.com>
	<CAA4eK1Jkouj=w+PHzMB6v890ES3QOLf=cUTvZmGFr-WMQW2OnA@mail.gmail.com>
	<CAD21AoB4_n7+s=uM9apX1JVtvGvgM8ismAx_uMxvDmUXfQULsw@mail.gmail.com>
	<CAD21AoBJcxRcaWQot302diaxoDcsnezRhnZa7p8UrPh5AGNeHQ@mail.gmail.com>
	<CAHut+PuSkabUB8H_hcwQz=BX5TWEj-8Ba+CP_PX78zN1fkhtKA@mail.gmail.com>
	<CAA4eK1K+rumWz=mHDLVVCig-i_cWWSzwDE1eMySq0WYc7_ve+Q@mail.gmail.com>
	<TY4PR01MB169070A34D1C74867EF5A2DE49453A@TY4PR01MB16907.jpnprd01.prod.outlook.com>

On Tue, Mar 31, 2026 at 5:07 AM Zhijie Hou (Fujitsu)
<[email protected]> wrote:
>
> On Tuesday, March 31, 2026 5:36 PM Amit Kapila <[email protected]> wrote:
> >
> > On Wed, Mar 25, 2026 at 2:19 PM Peter Smith <[email protected]>
> > wrote:
> > >
> > > There are many return points, and most of those "if" blocks cannot
> > > fall through (they return).
> > >
> > > I found it slightly difficult to read the code because I kept having
> > > to think, "OK, if we reached here, it means pubviaroot must be false,"
> > > or "OK, if we reached this far, then puballtables must be false, and
> > > pubviaroot must be false," etc.
> > >
> >
> > I can't say exactly why, but I find it difficult to read this function. So, I share
> > your concerns about the code of this function.
> > Because of its complexity it is difficult to ascertain that the functionality is
> > correct or we missed something. Also, considering it is correct today, in its
> > current form, it may become difficult to enhance it in future.
> >
>
> I attempted to refactor the code a bit based on my preferred style, as shown in
> the attachment. While the number of return points couldn't be reduced, I tried
> to eliminate if-else branches where possible. Sharing this top-up patch as a
> reference for an alternative style that reduces code size.
>

Thanks. It looks like a good refactoring! I'd prefer to free the
ancestors list to avoid memory leak.

I've attached the patch that incorporated all comments I got so far.
Feedback is very welcome.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


Attachments:

  [application/octet-stream] v7-0001-Add-target_relid-parameter-to-pg_get_publication_.patch (32.4K, 2-v7-0001-Add-target_relid-parameter-to-pg_get_publication_.patch)
  download | inline diff:
From f64b6d4611fa146433709d30b9c8288d15d2e240 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Fri, 27 Feb 2026 15:42:38 -0800
Subject: [PATCH v7] Add target_relid parameter to pg_get_publication_tables().

When a tablesync worker checks whether a specific table is published,
it previously called pg_get_publication_tables() and filtered the
result by relid on the subscriber side. This forced a full enumeration
of all tables in the publication before any filtering could occur. For
publications covering a large number of tables, this resulted in
expensive scans on the publisher and unnecessary overhead.

This commit adds a new overloaded form of pg_get_publication_tables()
that accepts an array of publication names and a target table
OID. Instead of enumerating all published tables, it evaluates
membership for the specified relation via syscache lookups, using the
new is_table_publishable_in_publication() helper. This helper
correctly accounts for publish_via_partition_root, ALL TABLES with
EXCEPT clauses, schema publications, and partition inheritance, while
avoiding the overhead of building the complete published table list.

The existing a VARIADIC array form of pg_get_publication_tables() is
preserved for backward compatibility. Tablesync workers use the new
two-argument form when connected to a publisher running PostgreSQL 19
or later.

Bump catalog version.

Reported-by: Marcos Pegoraro <[email protected]>
Reviewed-by: Zhijie Hou <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Peter Smith <[email protected]>
Reviewed-by: Hayato Kuroda <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Haoyan Wang <[email protected]>
Discussion: https://postgr.es/m/CAB-JLwbBFNuASyEnZWP0Tck9uNkthBZqi6WoXNevUT6+mV8XmA@mail.gmail.com
---
 src/backend/catalog/pg_publication.c        | 256 +++++++++++++++++---
 src/backend/replication/logical/tablesync.c |  70 ++++--
 src/backend/replication/pgoutput/pgoutput.c |   7 +-
 src/include/catalog/pg_proc.dat             |  11 +-
 src/include/catalog/pg_publication.h        |   2 +
 src/test/regress/expected/publication.out   | 225 +++++++++++++++++
 src/test/regress/sql/publication.sql        | 107 ++++++++
 7 files changed, 624 insertions(+), 54 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index a3192f19d35..b90e9794e7d 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -163,6 +163,37 @@ is_publishable_relation(Relation rel)
 	return is_publishable_class(RelationGetRelid(rel), rel->rd_rel);
 }
 
+/*
+ * Similar to is_publishable_class() but checks whether the given OID
+ * is a publishable "table" or not.
+ */
+static bool
+is_publishable_table(Oid tableoid)
+{
+	HeapTuple	tuple;
+	Form_pg_class relform;
+
+	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(tableoid));
+	if (!HeapTupleIsValid(tuple))
+		return false;
+
+	relform = (Form_pg_class) GETSTRUCT(tuple);
+
+	/*
+	 * Sequences are publishable according to is_publishable_class() so
+	 * explicitly exclude here.
+	 */
+	if (relform->relkind != RELKIND_SEQUENCE &&
+		is_publishable_class(tableoid, relform))
+	{
+		ReleaseSysCache(tuple);
+		return true;
+	}
+
+	ReleaseSysCache(tuple);
+	return false;
+}
+
 /*
  * SQL-callable variant of the above
  *
@@ -451,6 +482,26 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level
 	return topmost_relid;
 }
 
+/*
+ * A variant of GetTopMostAncestorInPublication() returns the top most
+ * published ancestor of the given relid.
+ */
+Oid
+GetTopMostAncestorInPublicationRelid(Oid pubid, Oid relid,
+									 int *ancestor_level)
+{
+	List	   *ancestors = get_partition_ancestors(relid);
+	Oid			ancestor;
+
+	ancestor = GetTopMostAncestorInPublication(pubid, ancestors,
+											   ancestor_level);
+
+	if (ancestors)
+		list_free(ancestors);
+
+	return ancestor;
+}
+
 /*
  * attnumstoint2vector
  *		Convert a Bitmapset of AttrNumbers into an int2vector.
@@ -1264,12 +1315,111 @@ GetPublicationByName(const char *pubname, bool missing_ok)
 }
 
 /*
- * Get information of the tables in the given publication array.
+ * Returns true if the table of the given relid is published for the specified
+ * publication.
+ *
+ * This function evaluates the effective published OID based on the
+ * publish_via_partition_root setting, rather than just checking catalog entries
+ * (e.g., pg_publication_rel). For instance, when publish_via_partition_root is
+ * false, it returns false for a parent partitioned table and true for its leaf
+ * partitions, even if the parent is the one explicitly added to the publication.
  *
- * Returns pubid, relid, column list, row filter for each table.
+ * For performance reasons, this function avoids the overhead of constructing
+ * the complete list of published tables during the evaluation. It can execute
+ * quickly even when the publication contains a large number of relations.
  */
-Datum
-pg_get_publication_tables(PG_FUNCTION_ARGS)
+static bool
+is_table_publishable_in_publication(Oid relid, Publication *pub)
+{
+	bool		relispartition;
+
+	/*
+	 * For non-pubviaroot publications, a partitioned table is never the
+	 * effective published OID; only its leaf partitions can be.
+	 */
+	if (!pub->pubviaroot && get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE)
+		return false;
+
+	relispartition = get_rel_relispartition(relid);
+
+	if (pub->alltables)
+	{
+		Oid			target_relid = relid;
+
+		/*
+		 * ALL TABLES with pubviaroot includes only regular tables or top-most
+		 * partitioned tables -- never child partitions.
+		 */
+		if (pub->pubviaroot && relispartition)
+			return false;
+
+		if (relispartition)
+		{
+			List	   *ancestors = get_partition_ancestors(relid);
+
+			/*
+			 * Only the top-most ancestor can appear in the EXCEPT clause.
+			 * Therefore, for a partition, exclusion must be evaluated at the
+			 * top-most ancestor.
+			 */
+			target_relid = llast_oid(ancestors);
+			list_free(ancestors);
+		}
+
+		/*
+		 * The table is published unless it appears in the EXCEPT clause. ALL
+		 * TABLES publications store only EXCEPT'ed tables in
+		 * pg_publication_rel, so checking existence is sufficient.
+		 */
+		return !SearchSysCacheExists2(PUBLICATIONRELMAP,
+									  ObjectIdGetDatum(target_relid),
+									  ObjectIdGetDatum(pub->oid));
+	}
+
+	/*
+	 * Non-alltables publication.
+	 */
+
+	if (relispartition &&
+		OidIsValid(GetTopMostAncestorInPublicationRelid(pub->oid,
+														relid, NULL)))
+	{
+		/*
+		 * If pubviaroot is true, the ancestor is published instead of the
+		 * partition, so exclude it. Otherwise, the ancestor covers the
+		 * partition, so include it.
+		 */
+		return !pub->pubviaroot;
+	}
+
+	/*
+	 * Check whether the table is explicitly published via pg_publication_rel
+	 * or pg_publication_namespace.
+	 */
+	return (SearchSysCacheExists2(PUBLICATIONRELMAP,
+								  ObjectIdGetDatum(relid),
+								  ObjectIdGetDatum(pub->oid)) ||
+			SearchSysCacheExists2(PUBLICATIONNAMESPACEMAP,
+								  ObjectIdGetDatum(get_rel_namespace(relid)),
+								  ObjectIdGetDatum(pub->oid)));
+}
+
+/*
+ * Helper function to get information of the tables in the given
+ * publication(s).
+ *
+ * If filter_by_relid is true, only the row for target_relid is returned;
+ * if target_relid does not exist or is not part of the publications, zero
+ * rows are returned.  If filter_by_relid is false, rows for all tables
+ * within the specified publications are returned and target_relid is
+ * ignored.
+ *
+ * Returns pubid, relid, column list, and row filter for each table.
+ */
+static Datum
+pg_get_publication_tables(FunctionCallInfo fcinfo, ArrayType *pubnames,
+						  Oid target_relid, bool filter_by_relid,
+						  bool pub_missing_ok)
 {
 #define NUM_PUBLICATION_TABLES_ELEM	4
 	FuncCallContext *funcctx;
@@ -1280,7 +1430,6 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 	{
 		TupleDesc	tupdesc;
 		MemoryContext oldcontext;
-		ArrayType  *arr;
 		Datum	   *elems;
 		int			nelems,
 					i;
@@ -1289,6 +1438,14 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		/* create a function context for cross-call persistence */
 		funcctx = SRF_FIRSTCALL_INIT();
 
+		/*
+		 * Preliminary check if the specified table can be published in the
+		 * first place. If not, we can return early without checking the given
+		 * publications and the table.
+		 */
+		if (filter_by_relid && !is_publishable_table(target_relid))
+			SRF_RETURN_DONE(funcctx);
+
 		/* switch to memory context appropriate for multiple function calls */
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
@@ -1296,8 +1453,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		 * Deconstruct the parameter into elements where each element is a
 		 * publication name.
 		 */
-		arr = PG_GETARG_ARRAYTYPE_P(0);
-		deconstruct_array_builtin(arr, TEXTOID, &elems, NULL, &nelems);
+		deconstruct_array_builtin(pubnames, TEXTOID, &elems, NULL, &nelems);
 
 		/* Get Oids of tables from each publication. */
 		for (i = 0; i < nelems; i++)
@@ -1306,32 +1462,48 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 			List	   *pub_elem_tables = NIL;
 			ListCell   *lc;
 
-			pub_elem = GetPublicationByName(TextDatumGetCString(elems[i]), false);
+			pub_elem = GetPublicationByName(TextDatumGetCString(elems[i]),
+											pub_missing_ok);
 
-			/*
-			 * Publications support partitioned tables. If
-			 * publish_via_partition_root is false, all changes are replicated
-			 * using leaf partition identity and schema, so we only need
-			 * those. Otherwise, get the partitioned table itself.
-			 */
-			if (pub_elem->alltables)
-				pub_elem_tables = GetAllPublicationRelations(pub_elem->oid,
-															 RELKIND_RELATION,
-															 pub_elem->pubviaroot);
+			if (pub_elem == NULL)
+				continue;
+
+			if (filter_by_relid)
+			{
+				/* Check if the given table is published for the publication */
+				if (is_table_publishable_in_publication(target_relid, pub_elem))
+				{
+					pub_elem_tables = list_make1_oid(target_relid);
+				}
+			}
 			else
 			{
-				List	   *relids,
-						   *schemarelids;
-
-				relids = GetIncludedPublicationRelations(pub_elem->oid,
-														 pub_elem->pubviaroot ?
-														 PUBLICATION_PART_ROOT :
-														 PUBLICATION_PART_LEAF);
-				schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
-																pub_elem->pubviaroot ?
-																PUBLICATION_PART_ROOT :
-																PUBLICATION_PART_LEAF);
-				pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
+				/*
+				 * Publications support partitioned tables. If
+				 * publish_via_partition_root is false, all changes are
+				 * replicated using leaf partition identity and schema, so we
+				 * only need those. Otherwise, get the partitioned table
+				 * itself.
+				 */
+				if (pub_elem->alltables)
+					pub_elem_tables = GetAllPublicationRelations(pub_elem->oid,
+																 RELKIND_RELATION,
+																 pub_elem->pubviaroot);
+				else
+				{
+					List	   *relids,
+							   *schemarelids;
+
+					relids = GetIncludedPublicationRelations(pub_elem->oid,
+															 pub_elem->pubviaroot ?
+															 PUBLICATION_PART_ROOT :
+															 PUBLICATION_PART_LEAF);
+					schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+																	pub_elem->pubviaroot ?
+																	PUBLICATION_PART_ROOT :
+																	PUBLICATION_PART_LEAF);
+					pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
+				}
 			}
 
 			/*
@@ -1491,6 +1663,30 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 	SRF_RETURN_DONE(funcctx);
 }
 
+Datum
+pg_get_publication_tables_a(PG_FUNCTION_ARGS)
+{
+	/*
+	 * Get information for all tables in the given publications.
+	 * filter_by_relid is false so all tables are returned; pub_missing_ok is
+	 * false for backward compatibility.
+	 */
+	return pg_get_publication_tables(fcinfo, PG_GETARG_ARRAYTYPE_P(0),
+									 InvalidOid, false, false);
+}
+
+Datum
+pg_get_publication_tables_b(PG_FUNCTION_ARGS)
+{
+	/*
+	 * Get information for the specified table in the given publications. The
+	 * SQL-level function is declared STRICT, so target_relid is guaranteed to
+	 * be non-NULL here.
+	 */
+	return pg_get_publication_tables(fcinfo, PG_GETARG_ARRAYTYPE_P(0),
+									 PG_GETARG_OID(1), true, true);
+}
+
 /*
  * Returns Oids of sequences in a publication.
  */
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f49a4852ecb..eb718114297 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -798,17 +798,35 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
 		 * publications).
 		 */
 		resetStringInfo(&cmd);
-		appendStringInfo(&cmd,
-						 "SELECT DISTINCT"
-						 "  (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
-						 "   THEN NULL ELSE gpt.attrs END)"
-						 "  FROM pg_publication p,"
-						 "  LATERAL pg_get_publication_tables(p.pubname) gpt,"
-						 "  pg_class c"
-						 " WHERE gpt.relid = %u AND c.oid = gpt.relid"
-						 "   AND p.pubname IN ( %s )",
-						 lrel->remoteid,
-						 pub_names->data);
+
+		if (server_version >= 190000)
+		{
+			/*
+			 * We can pass both publication names and relid to
+			 * pg_get_publication_tables() since version 19.
+			 */
+			appendStringInfo(&cmd,
+							 "SELECT DISTINCT"
+							 "  (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
+							 "   THEN NULL ELSE gpt.attrs END)"
+							 "  FROM pg_get_publication_tables(ARRAY[%s], %u) gpt,"
+							 "  pg_class c"
+							 " WHERE c.oid = gpt.relid",
+							 pub_names->data,
+							 lrel->remoteid);
+		}
+		else
+			appendStringInfo(&cmd,
+							 "SELECT DISTINCT"
+							 "  (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
+							 "   THEN NULL ELSE gpt.attrs END)"
+							 "  FROM pg_publication p,"
+							 "  LATERAL pg_get_publication_tables(p.pubname) gpt,"
+							 "  pg_class c"
+							 " WHERE gpt.relid = %u AND c.oid = gpt.relid"
+							 "   AND p.pubname IN ( %s )",
+							 lrel->remoteid,
+							 pub_names->data);
 
 		pubres = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data,
 							 lengthof(attrsRow), attrsRow);
@@ -982,14 +1000,28 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
 
 		/* Check for row filters. */
 		resetStringInfo(&cmd);
-		appendStringInfo(&cmd,
-						 "SELECT DISTINCT pg_get_expr(gpt.qual, gpt.relid)"
-						 "  FROM pg_publication p,"
-						 "  LATERAL pg_get_publication_tables(p.pubname) gpt"
-						 " WHERE gpt.relid = %u"
-						 "   AND p.pubname IN ( %s )",
-						 lrel->remoteid,
-						 pub_names->data);
+
+		if (server_version >= 190000)
+		{
+			/*
+			 * We can pass both publication names and relid to
+			 * pg_get_publication_tables() since version 19.
+			 */
+			appendStringInfo(&cmd,
+							 "SELECT DISTINCT pg_get_expr(gpt.qual, gpt.relid)"
+							 "  FROM pg_get_publication_tables(ARRAY[%s], %u) gpt",
+							 pub_names->data,
+							 lrel->remoteid);
+		}
+		else
+			appendStringInfo(&cmd,
+							 "SELECT DISTINCT pg_get_expr(gpt.qual, gpt.relid)"
+							 "  FROM pg_publication p,"
+							 "  LATERAL pg_get_publication_tables(p.pubname) gpt"
+							 " WHERE gpt.relid = %u"
+							 "   AND p.pubname IN ( %s )",
+							 lrel->remoteid,
+							 pub_names->data);
 
 		res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 1, qualRow);
 
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 4ecfcbff7ab..f4531efe7ec 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -2263,11 +2263,10 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
 				{
 					Oid			ancestor;
 					int			level;
-					List	   *ancestors = get_partition_ancestors(relid);
 
-					ancestor = GetTopMostAncestorInPublication(pub->oid,
-															   ancestors,
-															   &level);
+					ancestor = GetTopMostAncestorInPublicationRelid(pub->oid,
+																	relid,
+																	&level);
 
 					if (ancestor != InvalidOid)
 					{
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3579cec5744..afdcc915f08 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12468,7 +12468,16 @@
   proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
   proargmodes => '{v,o,o,o,o}',
   proargnames => '{pubname,pubid,relid,attrs,qual}',
-  prosrc => 'pg_get_publication_tables' },
+  prosrc => 'pg_get_publication_tables_a' },
+{ oid => '8060',
+  descr => 'get information of the specified table that is part of the specified publications',
+  proname => 'pg_get_publication_tables', prorows => '10',
+  proretset => 't', provolatile => 's',
+  prorettype => 'record', proargtypes => '_text oid',
+  proallargtypes => '{_text,oid,oid,oid,int2vector,pg_node_tree}',
+  proargmodes => '{i,i,o,o,o,o}',
+  proargnames => '{pubnames,target_relid,pubid,relid,attrs,qual}',
+  prosrc => 'pg_get_publication_tables_b' },
 { oid => '8052', descr => 'get OIDs of sequences in a publication',
   proname => 'pg_get_publication_sequences', prorows => '1000', proretset => 't',
   provolatile => 's', prorettype => 'oid', proargtypes => 'text',
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 89b4bb14f62..ad309e26e02 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -192,6 +192,8 @@ extern List *GetPubPartitionOptionRelations(List *result,
 											Oid relid);
 extern Oid	GetTopMostAncestorInPublication(Oid puboid, List *ancestors,
 											int *ancestor_level);
+extern Oid	GetTopMostAncestorInPublicationRelid(Oid puboid, Oid relid,
+												 int *ancestor_level);
 
 extern bool is_publishable_relation(Relation rel);
 extern bool is_schema_publication(Oid pubid);
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 91332e75eeb..3b0eaec4f21 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -2292,6 +2292,231 @@ DROP TABLE testpub_merge_pk;
 RESET SESSION AUTHORIZATION;
 DROP ROLE regress_publication_user, regress_publication_user2;
 DROP ROLE regress_publication_user_dummy;
+-- Test pg_get_publication_tables(text[], oid) function
+CREATE SCHEMA gpt_test_sch;
+CREATE TABLE gpt_test_sch.tbl_sch (id int);
+CREATE TABLE tbl_normal (id int);
+CREATE TABLE tbl_parent (id1 int, id2 int, id3 int) PARTITION BY RANGE (id1);
+CREATE TABLE tbl_part1 PARTITION OF tbl_parent FOR VALUES FROM (1) TO (10);
+CREATE VIEW gpt_test_view AS SELECT * FROM tbl_normal;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_via_partition_root = true);
+CREATE PUBLICATION pub_all_no_viaroot FOR ALL TABLES WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_all_except FOR ALL TABLES EXCEPT (TABLE tbl_parent, gpt_test_sch.tbl_sch) WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_all_except_no_viaroot FOR ALL TABLES EXCEPT (TABLE tbl_parent, gpt_test_sch.tbl_sch) WITH (publish_via_partition_root = true);
+CREATE PUBLICATION pub_schema FOR TABLES IN SCHEMA gpt_test_sch;
+CREATE PUBLICATION pub_normal FOR TABLE tbl_normal WHERE (id < 10);
+CREATE PUBLICATION pub_part_leaf FOR TABLE tbl_part1 WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_part_parent FOR TABLE tbl_parent (id1, id2) WHERE (id1 = 10) WITH (publish_via_partition_root = true);
+CREATE PUBLICATION pub_part_parent_no_viaroot FOR TABLE tbl_parent WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_part_parent_child FOR TABLE tbl_parent, tbl_part1 WITH (publish_via_partition_root = true);
+RESET client_min_messages;
+CREATE FUNCTION test_gpt(pubnames text[], relname text)
+RETURNS TABLE (
+  pubname text,
+  relname name,
+  attrs text,
+  qual text
+)
+BEGIN ATOMIC
+  SELECT p.pubname, c.relname, gpt.attrs::text, pg_get_expr(gpt.qual, gpt.relid)
+    FROM pg_get_publication_tables(pubnames, relname::regclass::oid) gpt
+    JOIN pg_publication p ON p.oid = gpt.pubid
+    JOIN pg_class c ON c.oid = gpt.relid
+  ORDER BY p.pubname, c.relname;
+END;
+SELECT * FROM test_gpt(ARRAY['pub_normal'], 'tbl_normal');
+  pubname   |  relname   | attrs |   qual    
+------------+------------+-------+-----------
+ pub_normal | tbl_normal | 1     | (id < 10)
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_normal'], 'gpt_test_sch.tbl_sch'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_schema'], 'gpt_test_sch.tbl_sch');
+  pubname   | relname | attrs | qual 
+------------+---------+-------+------
+ pub_schema | tbl_sch | 1     | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_schema'], 'tbl_normal'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent'], 'tbl_parent');
+     pubname     |  relname   | attrs |    qual    
+-----------------+------------+-------+------------
+ pub_part_parent | tbl_parent | 1 2   | (id1 = 10)
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent'], 'tbl_part1'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_no_viaroot'], 'tbl_part1');
+          pubname           |  relname  | attrs | qual 
+----------------------------+-----------+-------+------
+ pub_part_parent_no_viaroot | tbl_part1 | 1 2 3 | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_no_viaroot'], 'tbl_parent'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_leaf'], 'tbl_part1');
+    pubname    |  relname  | attrs | qual 
+---------------+-----------+-------+------
+ pub_part_leaf | tbl_part1 | 1 2 3 | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_leaf'], 'tbl_parent'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all'], 'tbl_parent');
+ pubname |  relname   | attrs | qual 
+---------+------------+-------+------
+ pub_all | tbl_parent | 1 2 3 | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_all'], 'tbl_part1'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_no_viaroot'], 'tbl_part1');
+      pubname       |  relname  | attrs | qual 
+--------------------+-----------+-------+------
+ pub_all_no_viaroot | tbl_part1 | 1 2 3 | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_no_viaroot'], 'tbl_parent'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_child'], 'tbl_parent');
+        pubname        |  relname   | attrs | qual 
+-----------------------+------------+-------+------
+ pub_part_parent_child | tbl_parent | 1 2 3 | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_child'], 'tbl_part1'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+-- test for the EXCLUDE clause
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'tbl_normal');
+    pubname     |  relname   | attrs | qual 
+----------------+------------+-------+------
+ pub_all_except | tbl_normal | 1     | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'gpt_test_sch.tbl_sch'); -- no result (excluded)
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'tbl_parent'); -- no result (excluded)
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'tbl_part1'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'tbl_normal');
+          pubname          |  relname   | attrs | qual 
+---------------------------+------------+-------+------
+ pub_all_except_no_viaroot | tbl_normal | 1     | 
+(1 row)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'gpt_test_sch.tbl_sch'); -- no result (excluded)
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'tbl_parent'); -- no result (excluded)
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'tbl_part1'); -- no result
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+-- two rows with different row filter
+SELECT * FROM test_gpt(ARRAY['pub_all', 'pub_normal'], 'tbl_normal');
+  pubname   |  relname   | attrs |   qual    
+------------+------------+-------+-----------
+ pub_all    | tbl_normal | 1     | 
+ pub_normal | tbl_normal | 1     | (id < 10)
+(2 rows)
+
+-- one row with 'pub_part_parent'
+SELECT * FROM test_gpt(ARRAY['pub_part_parent', 'pub_part_parent_no_viaroot'], 'tbl_parent');
+     pubname     |  relname   | attrs |    qual    
+-----------------+------------+-------+------------
+ pub_part_parent | tbl_parent | 1 2   | (id1 = 10)
+(1 row)
+
+-- no result, tbl_parent is the effective published OID due to pubviaroot
+SELECT * FROM test_gpt(ARRAY['pub_part_parent', 'pub_all'], 'tbl_part1');
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+-- no result, non-existent publication
+SELECT * FROM test_gpt(ARRAY['no_such_pub'], 'tbl_normal');
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+-- no result, non-table object
+SELECT * FROM test_gpt(ARRAY['pub_all'], 'gpt_test_view');
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+-- no result, empty publication array
+SELECT * FROM test_gpt(ARRAY[]::text[], 'tbl_normal');
+ pubname | relname | attrs | qual 
+---------+---------+-------+------
+(0 rows)
+
+-- no result, OID 0 as target_relid
+SELECT * FROM pg_get_publication_tables(ARRAY['pub_normal'], 0::oid);
+ pubid | relid | attrs | qual 
+-------+-------+-------+------
+(0 rows)
+
+-- Clean up
+DROP FUNCTION test_gpt(text[], text);
+DROP PUBLICATION pub_all;
+DROP PUBLICATION pub_all_no_viaroot;
+DROP PUBLICATION pub_all_except;
+DROP PUBLICATION pub_all_except_no_viaroot;
+DROP PUBLICATION pub_schema;
+DROP PUBLICATION pub_normal;
+DROP PUBLICATION pub_part_leaf;
+DROP PUBLICATION pub_part_parent;
+DROP PUBLICATION pub_part_parent_no_viaroot;
+DROP PUBLICATION pub_part_parent_child;
+DROP VIEW gpt_test_view;
+DROP TABLE tbl_normal, tbl_parent, tbl_part1;
+DROP SCHEMA gpt_test_sch CASCADE;
+NOTICE:  drop cascades to table gpt_test_sch.tbl_sch
 -- stage objects for pg_dump tests
 CREATE SCHEMA pubme CREATE TABLE t0 (c int, d int) CREATE TABLE t1 (c int);
 CREATE SCHEMA pubme2 CREATE TABLE t0 (c int, d int);
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 6bafad27571..94908e4f965 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -1438,6 +1438,113 @@ RESET SESSION AUTHORIZATION;
 DROP ROLE regress_publication_user, regress_publication_user2;
 DROP ROLE regress_publication_user_dummy;
 
+-- Test pg_get_publication_tables(text[], oid) function
+CREATE SCHEMA gpt_test_sch;
+CREATE TABLE gpt_test_sch.tbl_sch (id int);
+CREATE TABLE tbl_normal (id int);
+CREATE TABLE tbl_parent (id1 int, id2 int, id3 int) PARTITION BY RANGE (id1);
+CREATE TABLE tbl_part1 PARTITION OF tbl_parent FOR VALUES FROM (1) TO (10);
+CREATE VIEW gpt_test_view AS SELECT * FROM tbl_normal;
+
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION pub_all FOR ALL TABLES WITH (publish_via_partition_root = true);
+CREATE PUBLICATION pub_all_no_viaroot FOR ALL TABLES WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_all_except FOR ALL TABLES EXCEPT (TABLE tbl_parent, gpt_test_sch.tbl_sch) WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_all_except_no_viaroot FOR ALL TABLES EXCEPT (TABLE tbl_parent, gpt_test_sch.tbl_sch) WITH (publish_via_partition_root = true);
+CREATE PUBLICATION pub_schema FOR TABLES IN SCHEMA gpt_test_sch;
+CREATE PUBLICATION pub_normal FOR TABLE tbl_normal WHERE (id < 10);
+CREATE PUBLICATION pub_part_leaf FOR TABLE tbl_part1 WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_part_parent FOR TABLE tbl_parent (id1, id2) WHERE (id1 = 10) WITH (publish_via_partition_root = true);
+CREATE PUBLICATION pub_part_parent_no_viaroot FOR TABLE tbl_parent WITH (publish_via_partition_root = false);
+CREATE PUBLICATION pub_part_parent_child FOR TABLE tbl_parent, tbl_part1 WITH (publish_via_partition_root = true);
+RESET client_min_messages;
+
+CREATE FUNCTION test_gpt(pubnames text[], relname text)
+RETURNS TABLE (
+  pubname text,
+  relname name,
+  attrs text,
+  qual text
+)
+BEGIN ATOMIC
+  SELECT p.pubname, c.relname, gpt.attrs::text, pg_get_expr(gpt.qual, gpt.relid)
+    FROM pg_get_publication_tables(pubnames, relname::regclass::oid) gpt
+    JOIN pg_publication p ON p.oid = gpt.pubid
+    JOIN pg_class c ON c.oid = gpt.relid
+  ORDER BY p.pubname, c.relname;
+END;
+
+SELECT * FROM test_gpt(ARRAY['pub_normal'], 'tbl_normal');
+SELECT * FROM test_gpt(ARRAY['pub_normal'], 'gpt_test_sch.tbl_sch'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_schema'], 'gpt_test_sch.tbl_sch');
+SELECT * FROM test_gpt(ARRAY['pub_schema'], 'tbl_normal'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent'], 'tbl_parent');
+SELECT * FROM test_gpt(ARRAY['pub_part_parent'], 'tbl_part1'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_no_viaroot'], 'tbl_part1');
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_no_viaroot'], 'tbl_parent'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_part_leaf'], 'tbl_part1');
+SELECT * FROM test_gpt(ARRAY['pub_part_leaf'], 'tbl_parent'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_all'], 'tbl_parent');
+SELECT * FROM test_gpt(ARRAY['pub_all'], 'tbl_part1'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_all_no_viaroot'], 'tbl_part1');
+SELECT * FROM test_gpt(ARRAY['pub_all_no_viaroot'], 'tbl_parent'); -- no result
+
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_child'], 'tbl_parent');
+SELECT * FROM test_gpt(ARRAY['pub_part_parent_child'], 'tbl_part1'); -- no result
+
+-- test for the EXCLUDE clause
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'tbl_normal');
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'gpt_test_sch.tbl_sch'); -- no result (excluded)
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'tbl_parent'); -- no result (excluded)
+SELECT * FROM test_gpt(ARRAY['pub_all_except'], 'tbl_part1'); -- no result
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'tbl_normal');
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'gpt_test_sch.tbl_sch'); -- no result (excluded)
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'tbl_parent'); -- no result (excluded)
+SELECT * FROM test_gpt(ARRAY['pub_all_except_no_viaroot'], 'tbl_part1'); -- no result
+
+-- two rows with different row filter
+SELECT * FROM test_gpt(ARRAY['pub_all', 'pub_normal'], 'tbl_normal');
+
+-- one row with 'pub_part_parent'
+SELECT * FROM test_gpt(ARRAY['pub_part_parent', 'pub_part_parent_no_viaroot'], 'tbl_parent');
+
+-- no result, tbl_parent is the effective published OID due to pubviaroot
+SELECT * FROM test_gpt(ARRAY['pub_part_parent', 'pub_all'], 'tbl_part1');
+
+-- no result, non-existent publication
+SELECT * FROM test_gpt(ARRAY['no_such_pub'], 'tbl_normal');
+
+-- no result, non-table object
+SELECT * FROM test_gpt(ARRAY['pub_all'], 'gpt_test_view');
+
+-- no result, empty publication array
+SELECT * FROM test_gpt(ARRAY[]::text[], 'tbl_normal');
+
+-- no result, OID 0 as target_relid
+SELECT * FROM pg_get_publication_tables(ARRAY['pub_normal'], 0::oid);
+
+-- Clean up
+DROP FUNCTION test_gpt(text[], text);
+DROP PUBLICATION pub_all;
+DROP PUBLICATION pub_all_no_viaroot;
+DROP PUBLICATION pub_all_except;
+DROP PUBLICATION pub_all_except_no_viaroot;
+DROP PUBLICATION pub_schema;
+DROP PUBLICATION pub_normal;
+DROP PUBLICATION pub_part_leaf;
+DROP PUBLICATION pub_part_parent;
+DROP PUBLICATION pub_part_parent_no_viaroot;
+DROP PUBLICATION pub_part_parent_child;
+DROP VIEW gpt_test_view;
+DROP TABLE tbl_normal, tbl_parent, tbl_part1;
+DROP SCHEMA gpt_test_sch CASCADE;
+
 -- stage objects for pg_dump tests
 CREATE SCHEMA pubme CREATE TABLE t0 (c int, d int) CREATE TABLE t1 (c int);
 CREATE SCHEMA pubme2 CREATE TABLE t0 (c int, d int);
-- 
2.47.3

view thread (51+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Initial COPY of Logical Replication is too slow
  In-Reply-To: <CAD21AoBB3ovY4Hpr+HM6VkCY+sj3_71ZRi3oOP67w5ivvy4zuQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox