public inbox for [email protected]
help / color / mirror / Atom feedFrom: Chao Li <[email protected]>
To: Postgres hackers <[email protected]>
Subject: Improve logical replication usability when tables lack primary keys
Date: Mon, 10 Nov 2025 16:06:36 +0800
Message-ID: <CAEoWx2mMorbMwjKbT4YCsjDyL3r9Mp+z0bbK57VZ+OkJTgJQVQ@mail.gmail.com> (raw)
* BACKGROUND
This requirement comes from several users operating large deployments,
particularly in HIS (Hospital Information Systems). The situation can be
summarized as follows:
- A central DB operations team maintains the main database and configures
logical replication for all tables.
- Multiple third-party application vendors are allowed to create new tables
in that database.
- Some of these newly created tables lack a primary key. Since logical
replication with `REPLICATION IDENTITY DEFAULT` requires a primary key,
such tables silently fail to replicate.
- The DB operations team must then spend significant effort identifying the
affected tables and correcting them manually.
In practice, these environments would benefit from a safe fallback: if a
table has no primary key, logical replication should automatically switch
from `REPLICATION IDENTITY DEFAULT` to `FULL`, ensuring replication
continues rather than breaking.
I don't intend to debate whether this operational model is ideal; it is
simply the reality in many deployments. These database operations teams
have developed and refined their practices over many years, and as a
database vendor we have limited influence over how they manage their
environments.
* PROPOSED SOLUTION
I evaluated a few approaches and am proposing the following:
- Introduce a new GUC: `logical_replication_fallback_to_full_identity`.
- When enabled, if a table being logically replicated has no primary key,
the system automatically uses `REPLICATION IDENTITY FULL` for that table.
- This setting can be applied at the database level, so large systems do
not need to enable it cluster-wide unless desired.
- When the WAL sender transmits relation metadata, if fallback has
occurred, it explicitly reports `FULL` as the replication identity to the
subscriber, so there is limited impact on the subscriber.
* NEXT STEPS
The attached patch is an initial implementation. It does not yet include
tests or documentation updates. I would appreciate feedback on the design
approach first. If the direction seems reasonable, I will proceed with
refining the patch and adding documentation and tests.
Thanks in advance for your review.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Attachments:
[application/octet-stream] v1-0001-Fallback-default-replication-identity-to-full.patch (8.8K, 3-v1-0001-Fallback-default-replication-identity-to-full.patch)
download | inline diff:
From 79169a50a53da93ea5d2ba03239a884b4d660158 Mon Sep 17 00:00:00 2001
From: "Chao Li (Evan)" <[email protected]>
Date: Mon, 10 Nov 2025 15:28:03 +0800
Subject: [PATCH v1] Fallback default replication identity to full
Author: Chao Li <[email protected]>
---
src/backend/access/heap/heapam.c | 7 ++---
src/backend/commands/publicationcmds.c | 5 ++--
src/backend/executor/execReplication.c | 2 +-
src/backend/replication/logical/proto.c | 6 ++++-
src/backend/replication/logical/relation.c | 26 +++++++++++++++++++
src/backend/utils/misc/guc_parameters.dat | 7 +++++
src/backend/utils/misc/postgresql.conf.sample | 2 ++
src/include/replication/logicalrelation.h | 4 +++
8 files changed, 52 insertions(+), 7 deletions(-)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 36fee9c994e..6c47d140bd1 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -45,6 +45,7 @@
#include "commands/vacuum.h"
#include "pgstat.h"
#include "port/pg_bitutils.h"
+#include "replication/logicalrelation.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
#include "storage/procarray.h"
@@ -3119,7 +3120,7 @@ l1:
if (old_key_tuple != NULL)
{
- if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ if (logicalrep_identity_is_full(relation))
xlrec.flags |= XLH_DELETE_CONTAINS_OLD_TUPLE;
else
xlrec.flags |= XLH_DELETE_CONTAINS_OLD_KEY;
@@ -8941,7 +8942,7 @@ log_heap_update(Relation reln, Buffer oldbuf,
xlrec.flags |= XLH_UPDATE_CONTAINS_NEW_TUPLE;
if (old_key_tuple)
{
- if (reln->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ if (logicalrep_identity_is_full(reln))
xlrec.flags |= XLH_UPDATE_CONTAINS_OLD_TUPLE;
else
xlrec.flags |= XLH_UPDATE_CONTAINS_OLD_KEY;
@@ -9167,7 +9168,7 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_required,
if (replident == REPLICA_IDENTITY_NOTHING)
return NULL;
- if (replident == REPLICA_IDENTITY_FULL)
+ if (logicalrep_identity_is_full(relation))
{
/*
* When logging the entire old tuple, it very well could contain
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 1faf3a8c372..d3fa42123e2 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -37,6 +37,7 @@
#include "parser/parse_clause.h"
#include "parser/parse_collate.h"
#include "parser/parse_relation.h"
+#include "replication/logicalrelation.h"
#include "rewrite/rewriteHandler.h"
#include "storage/lmgr.h"
#include "utils/acl.h"
@@ -281,7 +282,7 @@ pub_rf_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
* FULL means all columns are in the REPLICA IDENTITY, so all columns are
* allowed in the row filter and we can skip the validation.
*/
- if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ if (logicalrep_identity_is_full(relation))
return false;
/*
@@ -389,7 +390,7 @@ pub_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
pub = GetPublication(pubid);
check_and_fetch_column_list(pub, publish_as_relid, NULL, &columns);
- if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ if (logicalrep_identity_is_full(relation))
{
/* With REPLICA IDENTITY FULL, no column list is allowed. */
*invalid_column_list = (columns != NULL);
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index def32774c90..22863c80154 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -1088,7 +1088,7 @@ CheckCmdReplicaIdentity(Relation rel, CmdType cmd)
return;
/* REPLICA IDENTITY FULL is also good for UPDATE/DELETE. */
- if (rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ if (logicalrep_identity_is_full(rel))
return;
/*
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index f0a913892b9..aa23e6e75f9 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -17,6 +17,7 @@
#include "catalog/pg_type.h"
#include "libpq/pqformat.h"
#include "replication/logicalproto.h"
+#include "replication/logicalrelation.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -669,6 +670,7 @@ logicalrep_write_rel(StringInfo out, TransactionId xid, Relation rel,
PublishGencolsType include_gencols_type)
{
char *relname;
+ char relreplident = rel->rd_rel->relreplident;
pq_sendbyte(out, LOGICAL_REP_MSG_RELATION);
@@ -685,7 +687,9 @@ logicalrep_write_rel(StringInfo out, TransactionId xid, Relation rel,
pq_sendstring(out, relname);
/* send replica identity */
- pq_sendbyte(out, rel->rd_rel->relreplident);
+ if (logicalrep_identity_is_full(rel))
+ relreplident = REPLICA_IDENTITY_FULL;
+ pq_sendbyte(out, relreplident);
/* send the attribute info */
logicalrep_write_attrs(out, rel, columns, include_gencols_type);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 745fd3bab64..2c756fca469 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -30,6 +30,7 @@
#include "utils/lsyscache.h"
#include "utils/syscache.h"
+bool logical_replication_fallback_to_full_identity = false;
static MemoryContext LogicalRepRelMapContext = NULL;
@@ -958,3 +959,28 @@ FindLogicalRepLocalIndex(Relation localrel, LogicalRepRelation *remoterel,
return InvalidOid;
}
+
+/*
+ * logicalrep_identity_is_full
+ *
+ * Check whether the replica identity of the relation is full or not.
+ * When a table's replica identity is default, but there is no primary key,
+ * if logical_replication_fallback_to_full_identity is true, we consider the
+ * replica identity as full. This funciton should only be called on the
+ * publisher.
+ */
+bool
+logicalrep_identity_is_full(Relation relation)
+{
+ Form_pg_class relform = RelationGetForm(relation);
+
+ if (relform->relreplident == REPLICA_IDENTITY_FULL)
+ return true;
+
+ if (relform->relreplident == REPLICA_IDENTITY_DEFAULT &&
+ logical_replication_fallback_to_full_identity &&
+ !OidIsValid(RelationGetReplicaIndex(relation)))
+ return true;
+
+ return false;
+}
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..a07edb05254 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1833,6 +1833,13 @@
max => 'MAX_KILOBYTES',
},
+{ name => 'logical_replication_fallback_to_full_identity', type => 'bool', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING',
+ short_desc => 'Use REPLICA IDENTITY FULL automatically when a table with DEFAULT identity has no primary key.',
+ long_desc => 'When enabled, logical replication will automatically send full-row data for tables that specify REPLICA IDENTITY DEFAULT but lack a primary key, instead of raising an error.',
+ variable => 'logical_replication_fallback_to_full_identity',
+ boot_val => 'false',
+},
+
{ name => 'maintenance_io_concurrency', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_IO',
short_desc => 'A variant of "effective_io_concurrency" that is used for maintenance work.',
long_desc => '0 disables simultaneous requests.',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f62b61967ef..dc17caa2c2a 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -346,6 +346,8 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#logical_replication_fallback_to_full_identity = off # fallback default
+ # replication identity to full if no primary key
# - Primary Server -
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 7a561a8e8d8..0c6299d3bca 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -16,6 +16,9 @@
#include "catalog/index.h"
#include "replication/logicalproto.h"
+/* GUC variables */
+extern PGDLLIMPORT bool logical_replication_fallback_to_full_identity;
+
typedef struct LogicalRepRelMapEntry
{
LogicalRepRelation remoterel; /* key is remoterel.remoteid */
@@ -48,6 +51,7 @@ extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *r
Relation partrel, AttrMap *map);
extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
LOCKMODE lockmode);
+extern bool logicalrep_identity_is_full(Relation relation);
extern bool IsIndexUsableForReplicaIdentityFull(Relation idxrel, AttrMap *attrmap);
extern Oid GetRelationIdentityOrPK(Relation rel);
--
2.39.5 (Apple Git-154)
view thread (26+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: Improve logical replication usability when tables lack primary keys
In-Reply-To: <CAEoWx2mMorbMwjKbT4YCsjDyL3r9Mp+z0bbK57VZ+OkJTgJQVQ@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox