Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wCVq2-0023eR-27 for pgsql-hackers@arkaria.postgresql.org; Tue, 14 Apr 2026 04:57:27 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wCVq0-009pAp-1M for pgsql-hackers@arkaria.postgresql.org; Tue, 14 Apr 2026 04:57:25 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wCVq0-009pAg-0A for pgsql-hackers@lists.postgresql.org; Tue, 14 Apr 2026 04:57:24 +0000 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wCVpy-00000000xzg-2QBv for pgsql-hackers@lists.postgresql.org; Tue, 14 Apr 2026 04:57:24 +0000 Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-2aaed195901so23118305ad.0 for ; Mon, 13 Apr 2026 21:57:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1776142640; cv=none; d=google.com; s=arc-20240605; b=WgBU5j/6ugXyfqrHE4m202Er63WjAsaJ7CupaeUpG1l7nyiipZKFwMOb+JckE/kSMI wqyoydYoOP8Yvb2e6NzqVZEwITfeaygDQCMlnnM5QMdvu8rbPcKkQ+LtFMUdjJm6ZZUJ OOAdta0m2J6IuM4X9N0ve8xTLoPDghDzfrECc4v381Wb6eUXZt4vUYuTt7fNGikajJxL VVodRBiTiUaSmO+cV61JoCRj/J7pGRwDQRhzbj/RRdC+NdVpxk4p7daWsjdKtzVHDxm4 rjavOliCqk8YUnlsgqUwYIDpvshvKMDmxjCTDR8mglDa5RTbpwWYa3Dj6gbCPXFOGJnQ Sbcg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=nqyFnw+1rF1zvnjV0deLUWSKNPzUUQZhBEGmQ7OtVhM=; fh=LYetjB5/Z96VBehDdWUddnAHlS2/PVDVDOAhjLjKsFE=; b=Xj+9V/Bdjy8PjHuGKYU8nPHjT5i7PvsPXC8E7CpAy2m8/JrkoqoLGlh9zt00c9vAIc b9JW+kIZNKJdnepwN8OaTjUYd1hF+nqee2ArgutLPIC6nJ2QN5yRSYEMxsVFC3FogZsy 72nCoX2muIlt0Z5QP8osEITQjoBdf868bSNYiBj+tZvSo8rYEcqrSOa+cr5234CukStE 8B2C3I5ZlaAOJ0zpl9BNKrytOAwyHcNt4UCHyRG56v8btmbmvvrY9Q97qVpqznXqFwup fBZxYH6lBlQqeJSIa5AfvsPK7+hTHx9pzRK7Lyv8As2YHoQwJ7XPZpNmtpHl3gfSmNgI csIg==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1776142640; x=1776747440; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=nqyFnw+1rF1zvnjV0deLUWSKNPzUUQZhBEGmQ7OtVhM=; b=C5cnY/b31FCfVCYuzP11mCQHKQMLvjA3wtsF6DwAhD5PYLKJCVDh3BXK2pnE86M2Bf 45OQHTvi2MfKZLuN0bLg51Dy7uXZo5wnEUKIbZL+ehzHxllaoJu3MVnUb+WDXKjeeKu0 MESnLSJQou/7MWDw4LkLDghN1Bdew6VRi1XhOX/7cj3ugUN8KKyt2ePvfokRSYRQkes8 4u84Z13IWnVIG+ZyHv/ZanJn+VTZneVHdhDtJjws4+WtI2G/56drUdFLshzhfc98cPYR 0vzs2rI7M5VWTwl3+lwXOmuJ0a7xtcgXvntvK8NTfp6cZJFITdKc2IbBXdMLn51YWGct O/Xg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776142640; x=1776747440; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=nqyFnw+1rF1zvnjV0deLUWSKNPzUUQZhBEGmQ7OtVhM=; b=anUs9qPyvhtxwQjzkDk3eIO+dp6sIkP1k7//JmBB4CnJJObIwis7F3ad00xm0PaX+B /Jy8OMlrWQYfqbH7mvXZOSqkCIzP0fX4A8rdnX13XRHidy52wvQa0l3jvpy45/9y8B0C YkLRxeOTgN+hNkc1Sm8XY1vNoKIhwUmBnx6QSjYnWLpVSbmNY1eWvRCdpoLdxDVn908I tweYyFQKAVilXaSfOc2Exaulv1hn8i/zAa1M/UA2CyYxIWVRvORogsjdN9zlYd+b2eJw FFmkaN4FWTs/MiQknnoHcehKRjgcwcnkJBDHS/ptQzFIYOb2mF157G5N4dskFfJ4yVaR acsw== X-Forwarded-Encrypted: i=1; AFNElJ9SRib30ZCHf1MpaF+CT5YpioouVjFfweSfikZozFQKInA1Cqzg/qCQJCZF0oTEWrrfygxF7gToBBBcTWk0@lists.postgresql.org X-Gm-Message-State: AOJu0YyxtG/+iXzj+mdrQ0Gp7/ba8PcRlzHGQk9g7hbAI68aIijs/JEm jBs26W0FrYqoBCapsPcHHq2K63OmYG0Yoz+4zRKwzCLuFrxrGyyPUyuCfKomD1ZMuegRz7nbW1H 7TvRKtqZICL0Ep3VcWfVBXGfrWWz++fY= X-Gm-Gg: AeBDietIcfquwhcNOsjPGK+RN5vg2Lro4M3ZnNn3nN11j6Icbgz9TnFRCZ4QobExk1Z OjlNTxDs9PnIeojidY0QU3S05XXl7SQzl+yNwaB6/Ibl5ZyNqCb0cBKzJp/9el3lL1Ivves+5gg 4rwd5Zv4InQqNbYfmi55eMs5XVE56rK4IxhusD/F8mxM9ULyeN7TntVKihu0Sdzd9sl1xQZeqDK hkVLfRHxS8VuDlJfZI4DhH0jYmb//nGOB2IipnczJLWbi0dQED4Zs7t+fvyUgMbJCTWM9PSlyt0 g8F7Oul/iH25LD4TrSGpNvHjFg9wbgYDmVMfLYz7ibl4Z64tEN63 X-Received: by 2002:a17:902:f642:b0:2b4:5986:cd80 with SMTP id d9443c01a7336-2b45986d953mr75249185ad.26.1776142639797; Mon, 13 Apr 2026 21:57:19 -0700 (PDT) MIME-Version: 1.0 References: <5ABD7727-CD22-4112-A186-0E788EE78109@gmail.com> <23A24BFF-18A7-4FE9-AAFA-13E1AA207DD0@gmail.com> <19dac243-1f46-4720-bdec-bf0f851d03b9@app.fastmail.com> <875BBCC0-CF08-4136-8E9E-F03DF75C3A11@gmail.com> <6BD39478-8F32-48BD-9595-678E5B4BEE8C@gmail.com> In-Reply-To: <6BD39478-8F32-48BD-9595-678E5B4BEE8C@gmail.com> From: shveta malik Date: Tue, 14 Apr 2026 10:27:08 +0530 X-Gm-Features: AQROBzBVEdOUX3RTV7g8_uqtR21scgI-HYp415xGkOlPOcUQMJzefiP4ee50CtQ Message-ID: Subject: Re: Improve logical replication usability when tables lack primary keys To: Chao Li Cc: Euler Taveira , GRANT ZHOU , "houzj.fnst@fujitsu.com" , Amit Kapila , Dilip Kumar , Postgres hackers , shveta malik Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Apr 14, 2026 at 9:17=E2=80=AFAM Chao Li wr= ote: > > > > > On Apr 8, 2026, at 18:22, shveta malik wrote: > > > > On Fri, Dec 19, 2025 at 1:39=E2=80=AFPM Chao Li wrote: > >> > >> > >> > >>> On Dec 18, 2025, at 22:49, Euler Taveira wrote: > >>> > >>> On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote: > >>>> On Wed, Dec 17, 2025 at 12:50=E2=80=AFPM Euler Taveira wrote: > >>>>> Each table needs to say what's its row identifier. The user created= a table > >>>>> without primary key. Well, create a primary key. There are dozens o= f thousands > >>>>> of objects. Use a script. > >>>> However, I=E2=80=99d like to share a user perspective regarding the = "use a > >>>> script" approach. The main value of `FOR TABLES IN SCHEMA` is > >>>> *in-database automation*. If users still need to maintain external > >>>> scripts to monitor and `ALTER` new tables to prevent replication > >>>> errors, it significantly diminishes the value of that automation. > >>>> > >>> > >>> As I tried to explain in the previous email, the problem with FOR ALL= TABLES > >>> and FOR TABLES IN SCHEMA syntax is that the is no catalog information= about the > >>> relations; the list of relations is collected at runtime. > >>> > >>> When I suggested "use a script" I was referring to fix the logical re= plication > >>> setup regarding the lack of primary key. There is no need to have an = automation > >>> outside the database, use an event trigger. If your lazy user doesn't= create > >>> the primary key, assign REPLICA IDENTITY FULL. Something like > >>> > >>> -- This example is far from being a complete solution for fixing the = lack of > >>> -- primary key in a logical replication scenario. > >>> -- ALTER TABLE should be supported too > >>> CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity() > >>> RETURNS event_trigger LANGUAGE plpgsql AS $$ > >>> DECLARE > >>> obj record; > >>> rec record; > >>> ricnt integer :=3D 0; > >>> BEGIN > >>> FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() > >>> LOOP > >>> IF obj.command_tag =3D 'CREATE TABLE' THEN > >>> SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid =3D= obj.objid AND indisprimary; > >>> RAISE NOTICE 'ricnt: %', ricnt; > >>> IF ricnt =3D 0 THEN > >>> EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPL= ICA IDENTITY FULL'; > >>> END IF; > >>> END IF; > >>> END LOOP; > >>> END; > >>> $$; > >>> > >>> CREATE EVENT TRIGGER event_trigger_for_replica_identity > >>> ON ddl_command_end > >>> EXECUTE FUNCTION event_trigger_for_replica_identity(); > >>> > >>> CREATE TABLE event_trigger_test_1 (a int); > >>> \d+ event_trigger_test_1 > >>> CREATE TABLE event_trigger_test_2 (a int primary key); > >>> \d+ event_trigger_test_2 > >>> CREATE TABLE event_trigger_test_3 (a int, b text not null, primary ke= y(b)); > >>> \d+ event_trigger_test_3 > >>> --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT event_trigger_test= _3_pkey; > >>> --\d+ event_trigger_test_3 > >>> > >>> DROP EVENT TRIGGER event_trigger_for_replica_identity; > >>> DROP FUNCTION event_trigger_for_replica_identity; > >>> DROP TABLE event_trigger_test_1, event_trigger_test_2, event_trigger_= test_3; > >>> > >>> 8<-------------------------------------------------------------------= ---------8< > >>> > >>>> Additionally, tables without Primary Keys are valid SQL and extremel= y > >>>> common in enterprise environments (e.g., audit logs, data warehousin= g). > >>>> In large-scale deployments, enforcing PKs on every single table isn'= t > >>>> always practical. > >>>> > >>> > >>> I'm not saying users shouldn't create tables without a primary key. I= 'm arguing > >>> that this decision should take into account what adjustments need to = be made to > >>> use these tables in logical replication. > >>> > >>>> > >>>> I think the goal of this proposal is not to change the underlying ta= ble > >>>> property design, but rather to seek a mechanism (like a Publication > >>>> option) to ensure this automation functions safely without external > >>>> intervention. It is simply about allowing the database to handle the= se > >>>> valid, common scenarios gracefully when automation is enabled. > >>>> > >>> > >>> You didn't get it. You already have one property to handle it and you= are > >>> proposing to add a second property to handle it. > >>> > >>> I think you are pursuing the wrong solution. IMO we need a solution t= o enforce > >>> that the logical replication contract is valid. If you create or modi= fy a table > >>> that is part of a publication, there is no validation that that table= complies > >>> with the publication properties (update and delete properties should = require an > >>> appropriate replica identity). We should close the gaps in both publi= cation and > >>> table. > >>> > >> > >> If I summarize Euler=E2=80=99s position in short words: discipline ove= r convenience. I actually strongly agree with that. In PG we generally pref= er explicit over implicit behavior, and predictability over magic. > >> > >> Based on the discussion so far, I think we share the following design = goals: > >> > >> 1) Keep replica identity as a table property. > >> 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in ta= bles without primary keys. > >> 3) Avoid global or implicit behavior changes. > >> 4) Preserve explicit opt-in for higher WAL cost. > >> 5) Keep the logical replication contract explicit and enforceable. > >> > >> I=E2=80=99ve been thinking about whether adding a new replica identity= could meet these goals. > >> > >> Today we have four replica identities: DEFAULT (PK, fallback to NONE),= INDEX, FULL, and NONE. > >> > >> What if we introduce a new replica identity, tentatively called =E2=80= =9CFORCE=E2=80=9D: PK with fallback to FULL. (Let=E2=80=99s keep our focus = on the design, not argue the name for now.) > >> > >> With this approach: > >> > >> 1) Replica identity remains a table property. > >> 2) Publication membership is still evaluated at runtime, so FOR TABLES= IN SCHEMA is not special-cased. > >> 3) No new GUCs are required. > >> 4) The user must explicitly opt in by setting the replica identity. On= ce FORCE is chosen, adding or dropping a primary key later does not silentl= y break UPDATE/DELETE replication. > >> 5) The logical replication contract remains explicit; the table declar= es that it is safe for UPDATE/DELETE replication even without a PK, at the = cost of higher WAL volume. > >> > >> This feels like a small, explicit extension of the existing RI semanti= cs. Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK = fallback > >> to NONE), so conditional RI behavior is not new, this would just make = a different fallback explicit and user-chosen. > > > > > > Hi Shveta, > > Thanks for your feedback. > > > Even though I like the idea of keeping replica identity as a table > > property, this raises a question about the need for a new FORCE > > option. With this approach, the user must explicitly opt REPLICA > > IDENTITY FORCE for tables without a primary key (pt 4 in your design). > > But if the user is already expected to take an explicit action, they > > could simply set REPLICA IDENTITY FULL themselves at table creation > > time or via ALTER TABLE. > > This is the tricky part. The need for this feature comes from some compli= cated operational models in practice. Table creation is often handled by an= application team, which focuses on data logic and usually does not care ab= out replication details. Data replication, on the other hand, is maintained= by a different team, which does not care what data the tables contain, but= only needs to ensure that all data is replicated correctly. In that kind o= f setup, we cannot expect REPLICA IDENTITY to be configured properly at tab= le creation time. > > Such operational models also imply that the database operations team is u= sually the central database team, with superuser privileges, while an appli= cation typically has privileges only on the tables it owns. The central tea= m can set configurations at the database or cluster level. In most cases, w= hat they want is to replicate all tables by default, except for a small num= ber of tables that are explicitly excluded. > > Actually, this feature request is coming from the replication team. Today= , they have to manually identify newly created tables, usually by running s= ome script, and then configure REPLICA IDENTITY for them. That takes effort= and is error-prone. So what they want is a DEFAULT-to-FULL fallback when a= table has no primary key. Okay, I see. If the goal is specifically to address this use-case, introducing a FORCE table property does not seem to solve the underlying problem. Using 'ALTER TABLE t REPLICA IDENTITY FORCE' is effectively no different from 'ALTER TABLE t REPLICA IDENTITY FULL'. In both cases, a manual action must still be taken on each table, which is precisely what the replication team is trying to avoid. > That=E2=80=99s why my initial proposal was to add a GUC to control the fa= llback behavior, which would be convenient for the central database team. > > > From what I understood, the original > > requirement was to ensure that tables without a primary key do not > > silently fail replication, and that this is handled automatically > > without manual intervention. In contrast, the FORCE approach still > > requires users to explicitly configure replica identity for each > > table, which seems to reintroduce the same operational burden. > > > > For example, instead of: > > ALTER TABLE t1 REPLICA IDENTITY FORCE; > > the user could simply do: > > ALTER TABLE t1 REPLICA IDENTITY FULL > > > > By the time, the user is explicitly configuring replica identity, they > > would already need to understand the replication semantics, so it's > > not clear what additional benefit FORCE provides over FULL. Am I > > missing something here? > > > > thanks > > Shveta > > > -- > Chao Li (Evan) > HighGo Software Co., Ltd. > https://www.highgo.com/ > > > >