Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vVnuJ-008FDz-2D for pgsql-hackers@arkaria.postgresql.org; Wed, 17 Dec 2025 09:33:20 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vVnuI-00BhSW-0m for pgsql-hackers@arkaria.postgresql.org; Wed, 17 Dec 2025 09:33:19 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vVnuH-00BhSK-2d for pgsql-hackers@lists.postgresql.org; Wed, 17 Dec 2025 09:33:18 +0000 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vVnuG-0017el-2d for pgsql-hackers@lists.postgresql.org; Wed, 17 Dec 2025 09:33:17 +0000 Received: by mail-pj1-x102a.google.com with SMTP id 98e67ed59e1d1-34b75f7a134so3174221a91.0 for ; Wed, 17 Dec 2025 01:33:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1765963995; x=1766568795; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=wcCuWkbvSZGufRnvtoHb6LjWh45AN0GdQaMaKB4EqKQ=; b=YSmE0Z+fMr8iEWjrdWvD8bFSBQd86rnk7HIdtx6l+AmyC/8QLrenRXts4cl5b0Taro bw0bLj3AuzOXNzrqDL1DY+zzak1nSXPlCiaOCg5zjyaKXq1bgtM2oIU3ow5uilndkv1F DexCE3sUawz39k6Rx0kSUE3kl6M42OIQFhgH4EpVtdRhBwq6B57nQSHaefgG+rHqupbk 9bXq7tzQbhJYmnYIeJyaQDtnYnXUsU0E1luiuM5lr+Y/zz71AP+PkmydjtRnkFuJJ+yK saqwoaLe9+yZgWxQGcwetzBHOwlBhPXLM9AkuWPiI4eAqw6YTfasUaC+4K0gUMXORtXe Pmkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1765963995; x=1766568795; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=wcCuWkbvSZGufRnvtoHb6LjWh45AN0GdQaMaKB4EqKQ=; b=wAhEfXe1MGOaqrPPuCB+e29jJNkt/+RKcXkEz80PuqyEJDWcdj37aI14IV7WV1ocwM BulFwEfN1NPEJUj36busyEmJs2ltHTBiXmCfAFuTvjyVNls9jCR2oJleK0Z6TDFNQQLq 9gm04ul34yKZOEBpf+/3ciVIbiMjbIMzUjzE5Dvaf49NzvHLOYMGrx5R1jBXQta4rddY c/QyrORI210cX+CVBEU3fEqXVUhY0XhcNRzqB+bmJAtHj1LFdO6Nlq2iwKyUzBUVqSiM 0EE5I7cuEYS3zIy3WnTBW7WWTgEA9CWejn6zTermQRKxZPepvYVP3vjDbfouQAx2/ali lrNg== X-Forwarded-Encrypted: i=1; AJvYcCVI5lGtyAmYl0JnUgMKcBQ/IcbMZaqD/pChyCYhaFYyM89g2OOMtWg1eMaJkbCSrA2qc+XB2cZ20dRp5qBU@lists.postgresql.org X-Gm-Message-State: AOJu0Yw7WYolUBhC4C6NhHiHqnNRiogutuzYBP7LXs0Fu7VzcKWPatgt +Py4NXxwrfVCU+t0O6CNT924t7Ox09AmFsyAXDmHlBJdNdEgd13T+QK+EbMqwnqbc9G73NX0Qds jC/RPT8riYX3+/ZMApU8C8jMQ/iq27oM= X-Gm-Gg: AY/fxX60YiEsgtZmGzIRmBGq0pAxPDoJ0pRnIUSNbCWrCwVFTG+YXoQXClYqzthfZCE gZryX80EBzA99wQNiD7tHb6+7eS5Lum07Eyr2kmMb5EoFDO7L3fwgCTSZRAgxEdX5IGBj19h67G 8uK2BMY5bXtANLiwggC3Uzr5qaZvKTIQ7/YUyeu9EgjNtC3s/CFj1KVnTc/K0j9al7QfFiyrg6M N1LEHfFlBxUrB8Q6T6E/NZljIXfUrfBtMxPmcS9kpfpTO3OciP24Po8IwxOUOfKt+xPs7Oec/G6 CjAl1lZTIkYMIXfoLLF6PUDea7Ff X-Google-Smtp-Source: AGHT+IHqbHl/B3VTumgNJ6r3ciXYGYBxIRZuyxffZBqMNJif4E2NEMVlWqIfsqWYiJPYfdKzv+K/PLHRV6XBsdZM9V8= X-Received: by 2002:a17:90b:3912:b0:343:d70e:bef0 with SMTP id 98e67ed59e1d1-34abd8488c6mr15938055a91.21.1765963995332; Wed, 17 Dec 2025 01:33:15 -0800 (PST) MIME-Version: 1.0 References: <5ABD7727-CD22-4112-A186-0E788EE78109@gmail.com> <23A24BFF-18A7-4FE9-AAFA-13E1AA207DD0@gmail.com> In-Reply-To: From: GRANT ZHOU Date: Wed, 17 Dec 2025 01:32:39 -0800 X-Gm-Features: AQt7F2qU8nYqDVJLjF6MwugKsVHBwGrrgnZ74pA5i261RiBjUi40_nP92jEdOD0 Message-ID: Subject: Re: Improve logical replication usability when tables lack primary keys To: Chao Li Cc: Amit Kapila , Dilip Kumar , Postgres hackers Content-Type: multipart/alternative; boundary="000000000000a0c69e06462288b3" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000a0c69e06462288b3 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Dec 16, 2025 at 4:59=E2=80=AFPM Chao Li wr= ote: > > On Dec 15, 2025, at 13:48, Amit Kapila wrote: > > > So, without patch, there is no way we can silently replicate the > > UPDATE/DELETE. Ideally, users should alter the tables and make RI as > > FULL in such cases if they don't have PK for such tables. Falling back > > to FULL for DEFAULT when the table doesn't have PK based on GUC has a > > downside that it will increase WAL volume by a large amount. > > I agree that this downside exists, but it is an inherent cost that users > must accept if they choose to replicate all tables, including those witho= ut > a primary key. In practice, users who opt into such a configuration are > typically aware of the WAL overhead and make that trade-off consciously. > > > I don't know what is a good way to give to users who don't want to do > > the required setup but if we really want to provide something, it is > > better to allow such a thing via the publication option instead. > > Using a publication-level option could also work. One complication, > however, is that a table can belong to multiple publications. For example= , > if table_a belongs to both pub_a and pub_b, and only pub_a is configured > with fallback_to_full while pub_b keeps the default behavior > (fallback_to_none), then the effective behavior for table_a would need to > remain fallback_to_none, meaning that UPDATE/DELETE would still not be > allowed if table_a has not a primary key. > > > I think it would be good to do such an enhancement if we have more > > community support and some other users also appreciate such a feature. > > Otherwise, adding something which is specific to a particular user > > sounds like a recipe of maintenance burden especially when we already > > provide a way to achieve the same thing as is required by the user. > > Let me elaborate on that point. > > My company has a very large user base in China, with over 100K deployment= s > across multiple industries. However, there is currently a significant gap > between this large user population and direct participation in the PG > community. I joined the company in July this year as a full-time > contributor to the PG community, and one of my responsibilities is to hel= p > bridge this gap and bring real-world user feedback into community > discussions. > > As I mentioned in my earlier email, this requirement comes from > large-scale deployments. The database owners in these environments have > operational models that may not always align with what we consider the > ideal or fully optimized setup, but they are the result of years of > accumulated practice and operational experience. For these users, the > proposed feature would significantly simplify their day-to-day operations > and reduce operational friction. > +1 on the importance of addressing these large-scale operational realities. Beyond the scale issue, I believe there is a noticeable inconsistency between the documentation's promise of automation and the actual behavior of Replica Identity. 1. The "Practical Gap" of Schema Automation According to the documentation for FOR TABLES IN SCHEMA [1], the feature matches "all tables in the specified list of schemas, including tables created in the future". This explicitly promises an unattended, automated workflow for new tables. However, this promise is immediately broken by the default Replica Identity rules: 1) New tables are created with REPLICA IDENTITY DEFAULT [2] by standard. 2) For tables without a primary key, DEFAULT identity "cannot support UPDATE or DELETE operations" and "attempting such operations will result in an error on the publisher"[3]. This creates a logical trap: The system automatically adds the new table to the publication (as promised), but then immediately fails on the first UPDATE operation because the table creates with an incompatible default identity. This forces manual intervention (ALTER TABLE) in what is supposed to be an automated workflow. 2. Regarding the solution: I support Amit's suggestion of a Publication Option. It avoids the risks of a global GUC while allowing users to explicitly opt-in to the trade-off (accepting higher WAL volume) to ensure the automation provided by FOR TABLES IN SCHEMA is functionally complete. [1] https://www.postgresql.org/docs/current/sql-createpublication.html#SQL-CREA= TEPUBLICATION-PARAMS-FOR-TABLES-IN-SCHEMA [2] https://www.postgresql.org/docs/18/sql-altertable.html#SQL-ALTERTABLE-REPLI= CA-IDENTITY [3] https://www.postgresql.org/docs/18/logical-replication-publication.html#LOG= ICAL-REPLICATION-PUBLICATION-REPLICA-IDENTITY -- Grant Zhou at Highgo Software --000000000000a0c69e06462288b3 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Tue, Dec 16, 2025 at = 4:59=E2=80=AFPM Chao Li <li.evan.chao@gmail.com> wrote:
> On Dec 15, 2025, at 13:48, Amit Kapila <amit.kapila16@gmail.com> wrote:
> So, without patch, there is no way we can silently replicate the<= br> > UPDATE/DELETE. Ideally, users should alter the tables and make RI as > FULL in such cases if they don't have PK for such tables. Falling = back
> to FULL for DEFAULT when the table doesn't have PK based on GUC ha= s a
> downside that it will increase WAL volume by a large amount.

I agree that this downside exists, but it is an inherent cost that users mu= st accept if they choose to replicate all tables, including those without a= primary key. In practice, users who opt into such a configuration are typi= cally aware of the WAL overhead and make that trade-off consciously.
> I don't know what is a good way to give to users who don't wan= t to do
> the required setup but if we really want to provide something, it is > better to allow such a thing via the publication option instead.

Using a publication-level option could also work. One complication, however= , is that a table can belong to multiple publications. For example, if tabl= e_a belongs to both pub_a and pub_b, and only pub_a is configured with fall= back_to_full while pub_b keeps the default behavior (fallback_to_none), the= n the effective behavior for table_a would need to remain fallback_to_none,= meaning that UPDATE/DELETE would still not be allowed if table_a has not a= primary key.

> I think it would be good to do such an enhancement if we have more
> community support and some other users also appreciate such a feature.=
> Otherwise, adding something which is specific to a particular user
> sounds like a recipe of maintenance burden especially when we already<= br> > provide a way to achieve the same thing as is required by the user.
Let me elaborate on that point.

My company has a very large user base in China, with over 100K deployments = across multiple industries. However, there is currently a significant gap b= etween this large user population and direct participation in the PG commun= ity. I joined the company in July this year as a full-time contributor to t= he PG community, and one of my responsibilities is to help bridge this gap = and bring real-world user feedback into community discussions.

As I mentioned in my earlier email, this requirement comes from large-scale= deployments. The database owners in these environments have operational mo= dels that may not always align with what we consider the ideal or fully opt= imized setup, but they are the result of years of accumulated practice and = operational experience. For these users, the proposed feature would signifi= cantly simplify their day-to-day operations and reduce operational friction= .

+1 on the importance of addressing these l= arge-scale operational realities.

Beyond the scale issue, I believe = there is a noticeable inconsistency between the documentation's promise= of automation and the actual behavior of Replica Identity.

1. The &= quot;Practical Gap" of Schema Automation
According to the documenta= tion for FOR TABLES IN SCHEMA [1], the feature matches "all tables in = the specified list of schemas, including tables created in the future"= . This explicitly promises an unattended, automated workflow for new tables= .

However, this promise is immediately broken by the default Replica= Identity rules:
1) New tables are created with REPLICA IDENTITY DEFAULT= [2] by standard.
2) For tables without a primary key, DEFAULT identity= "cannot support UPDATE or DELETE operations" and "attemptin= g such operations will result in an error on the publisher"[3].
This creates a logical trap: The system automatically adds the new table t= o the publication (as promised), but then immediately fails on the first UP= DATE operation because the table creates with an incompatible default ident= ity.
This forces manual intervention (ALTER TABLE) in what is supposed = to be an automated workflow.

2. Regarding the solution:
I support= Amit's suggestion of a Publication Option. It avoids the risks of a gl= obal GUC while allowing users to explicitly opt-in to the trade-off (accept= ing higher WAL volume) to ensure the automation provided by FOR TABLES IN S= CHEMA is functionally complete.

[1] https://www.postgresql.org/docs/cur= rent/sql-createpublication.html#SQL-CREATEPUBLICATION-PARAMS-FOR-TABLES-IN-= SCHEMA
[2] https://www.pos= tgresql.org/docs/18/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY=
--
Grant Zhou at Highgo Software=C2=A0
--000000000000a0c69e06462288b3--