Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wAQ3M-002Mld-0U for pgsql-hackers@arkaria.postgresql.org; Wed, 08 Apr 2026 10:22:32 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wAQ3K-006YMo-2C for pgsql-hackers@arkaria.postgresql.org; Wed, 08 Apr 2026 10:22:31 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wAQ3K-006YMg-1A for pgsql-hackers@lists.postgresql.org; Wed, 08 Apr 2026 10:22:30 +0000 Received: from mail-pj1-x1036.google.com ([2607:f8b0:4864:20::1036]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wAQ3I-00000001HNI-3Lsq for pgsql-hackers@lists.postgresql.org; Wed, 08 Apr 2026 10:22:30 +0000 Received: by mail-pj1-x1036.google.com with SMTP id 98e67ed59e1d1-35da9692ec3so5407627a91.1 for ; Wed, 08 Apr 2026 03:22:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1775643747; cv=none; d=google.com; s=arc-20240605; b=UTjNme/4t/dc82xe43N8euAQVMaW16bJPEnDL3dIXl3h4McofHkL0tQo4rm7/nZm1q EfaE8TNevBOFpw4DiLltq0vJmaw+EO5QvEgK6ovQGgMzLIz2NWy/526sozwoeaDrA8O+ BH9inLWzO1MPEmtZfyJZS9s9wihqeOVK7pNbeqUq3A9Jcqr46+wAjVHshKGRVSru+juG QU7GL3ENtXcjoV7SKz+dUWodatiXrsCMM98SdJRDKMFqrOAxaKPcDxhC19Bnw8J7awFn 4NZLMWQ9wJQqbXE9tSF3fG6ygoQc9u4s/OkH3fYJnDonM9xAP4jzvnDzNBQ/7cFojdVf UxuA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=BQy0UqZslSJu3Tims8wY3ZsJMVROicmDIf8Hoc11+20=; fh=NiL91BUYMcyKgITa7+sE4/1vPzd4H3ditnhMxR7nzlw=; b=YfgyAfernapGGWLOTtwP3tnycKQg2xTugiIcauzKo2JFI4EkY2aobKr4LLNNxhzq4J mNiNbroarZcNsbDBYyowBi3nWK83BpxgR8Pq2VxIxqULwqYvvV4HmBOf5tNnqjpr1Wze l3l2MWhYXSzPk/OPXgBjywRXQCRb848VYjmNFoxSVmyaEiRZvAiMxt9r25RZlRPaeufR QufDLyxSYpDAtRt2B3eCPxh5xbJPrFcpVk/Nso3G8aHRNVsldFPa+2f4gTMuxk2CQh/9 ZiRBE/wKVqhRpDyUIM/784X2WY4JZlROeouO9Zz10FLRZXoKfUXD0jvO+71VHkSAF/wK UnOg==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1775643747; x=1776248547; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=BQy0UqZslSJu3Tims8wY3ZsJMVROicmDIf8Hoc11+20=; b=Bh3GK2RMAni1JS+cSeV3DdaEixvmLQ5hQ07FzW3nNIbuqOCopF7bX1DLaTqlLlq8e/ boCsH4HLFPrdhHfKB2oxbZYDpokA0GgdJN0Bgl6d8QloV3psbk5gCpBIROmP3AtnKir5 B5S7DsGwIJD7v9N+/x151peurX3LLEeh9ttTxurG4t7kccyrAkE678TF7ZfShzgvWwin 3riIbvj9mGsIl2mCxgDBYGfv9MgvENtZeT40pZ/qrrkNYSBSYOlnuUa3dQGPjHQnNxFx c3L7iM5FyMoPipfK83+x2ijZ3QdSZBtzxKXUGdGAQ6gxNLiAXjjTGA7T6/F8sbgVJvKF k41Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775643747; x=1776248547; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=BQy0UqZslSJu3Tims8wY3ZsJMVROicmDIf8Hoc11+20=; b=ZrWzItCCWbiM/XB8SUYe6n3O+EHkl0JEGGWB8hAGd9n6z/+KCdT8QXRmPzGx3Hk84x Jaa+QOeMLHBth4vaagOFnP35gKVCMFDXMNSj1jZSMpXHBxSDa5Qm4/yx6/BIhRhYsHcw KWnL4yx1euk5uaY1lxQeV9LJbxXvr3Q+nlZjXk2C2XgqhG+77tyOD/VEiGIWIwcjyNdx qVQy8GobGqFPyEsGSoUoKllcuQHl9evhhYyrEjKxeBH6vJUYRk9RimrfofxAi2Lr//5G rltrryB4m6JM4D3Z9N3AaQ7TkLfR98NCepJ+uuhh+okTw4pf8riIRjI4P9ertPuFo+qh bhhQ== X-Forwarded-Encrypted: i=1; AJvYcCV5xOwcsczXcYuH3oqmE28+hCFGlFDikyJ/eh3B7RJOG1XrLVzGwPj42Fp5boufWLEUlmScyeLhVmkHTtj9@lists.postgresql.org X-Gm-Message-State: AOJu0YzGfu6N5iHyBhVyuQdpaNjqgxsQmxCaDo0cTWLSL70bPpzSLVtV M0MOE/oCTPy4z8jmlOOmCr0ZYWuBeYYbmQpOkPnPjDTUTs2t2gWryUQ7GeVscRXWReZh16Xch74 INHVofUj+7XpdKjZTIWTt6RpcyywKmDQ= X-Gm-Gg: AeBDieuo+utbDGmMeCWMMeWdku+uS6wyBlBnvK6F7uNTPAaDXlV2N9bHk9rVoci4fVq r/LyH5vmfsq7QbxEjIWrH30qJmV3834SLeOhwchJ8/SuAgtQP/rHbynFTpwz47pwqOSWFjIbn+f 4UStcNL1m4RF078C7PCMNSBx/2UzbUqna6G4ccX/mL/J7huTKwk/hEl2cYsh7XOnFhU1u3XpF4d LlWsBAeCZAEXgMOenGovONDKDq8oLqRevpto1HGHQ23wAqGDusfrqWhhJA4hCqi/NVqIsfPcvNJ I7IK8u7eRLN5jalidTtp10/bc47rBOWZDrGsknjFWW7iFgHDSwc= X-Received: by 2002:a17:903:2ec7:b0:2b0:4554:9c24 with SMTP id d9443c01a7336-2b2817a20e9mr216134995ad.32.1775643746644; Wed, 08 Apr 2026 03:22:26 -0700 (PDT) MIME-Version: 1.0 References: <5ABD7727-CD22-4112-A186-0E788EE78109@gmail.com> <23A24BFF-18A7-4FE9-AAFA-13E1AA207DD0@gmail.com> <19dac243-1f46-4720-bdec-bf0f851d03b9@app.fastmail.com> <875BBCC0-CF08-4136-8E9E-F03DF75C3A11@gmail.com> In-Reply-To: <875BBCC0-CF08-4136-8E9E-F03DF75C3A11@gmail.com> From: shveta malik Date: Wed, 8 Apr 2026 15:52:14 +0530 X-Gm-Features: AQROBzDjNSaN4sCIUuRJh5yKhthzaR_qUesdGexBuoCeoiCgvvlDro7zCx7uPkg Message-ID: Subject: Re: Improve logical replication usability when tables lack primary keys To: Chao Li Cc: Euler Taveira , GRANT ZHOU , "houzj.fnst@fujitsu.com" , Amit Kapila , Dilip Kumar , Postgres hackers , shveta malik Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Dec 19, 2025 at 1:39=E2=80=AFPM Chao Li wr= ote: > > > > > On Dec 18, 2025, at 22:49, Euler Taveira wrote: > > > > On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote: > >> On Wed, Dec 17, 2025 at 12:50=E2=80=AFPM Euler Taveira wrote: > >>> Each table needs to say what's its row identifier. The user created a= table > >>> without primary key. Well, create a primary key. There are dozens of = thousands > >>> of objects. Use a script. > >> However, I=E2=80=99d like to share a user perspective regarding the "u= se a > >> script" approach. The main value of `FOR TABLES IN SCHEMA` is > >> *in-database automation*. If users still need to maintain external > >> scripts to monitor and `ALTER` new tables to prevent replication > >> errors, it significantly diminishes the value of that automation. > >> > > > > As I tried to explain in the previous email, the problem with FOR ALL T= ABLES > > and FOR TABLES IN SCHEMA syntax is that the is no catalog information a= bout the > > relations; the list of relations is collected at runtime. > > > > When I suggested "use a script" I was referring to fix the logical repl= ication > > setup regarding the lack of primary key. There is no need to have an au= tomation > > outside the database, use an event trigger. If your lazy user doesn't c= reate > > the primary key, assign REPLICA IDENTITY FULL. Something like > > > > -- This example is far from being a complete solution for fixing the la= ck of > > -- primary key in a logical replication scenario. > > -- ALTER TABLE should be supported too > > CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity() > > RETURNS event_trigger LANGUAGE plpgsql AS $$ > > DECLARE > > obj record; > > rec record; > > ricnt integer :=3D 0; > > BEGIN > > FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() > > LOOP > > IF obj.command_tag =3D 'CREATE TABLE' THEN > > SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid =3D = obj.objid AND indisprimary; > > RAISE NOTICE 'ricnt: %', ricnt; > > IF ricnt =3D 0 THEN > > EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPLI= CA IDENTITY FULL'; > > END IF; > > END IF; > > END LOOP; > > END; > > $$; > > > > CREATE EVENT TRIGGER event_trigger_for_replica_identity > > ON ddl_command_end > > EXECUTE FUNCTION event_trigger_for_replica_identity(); > > > > CREATE TABLE event_trigger_test_1 (a int); > > \d+ event_trigger_test_1 > > CREATE TABLE event_trigger_test_2 (a int primary key); > > \d+ event_trigger_test_2 > > CREATE TABLE event_trigger_test_3 (a int, b text not null, primary key(= b)); > > \d+ event_trigger_test_3 > > --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT event_trigger_test_3= _pkey; > > --\d+ event_trigger_test_3 > > > > DROP EVENT TRIGGER event_trigger_for_replica_identity; > > DROP FUNCTION event_trigger_for_replica_identity; > > DROP TABLE event_trigger_test_1, event_trigger_test_2, event_trigger_te= st_3; > > > > 8<---------------------------------------------------------------------= -------8< > > > >> Additionally, tables without Primary Keys are valid SQL and extremely > >> common in enterprise environments (e.g., audit logs, data warehousing)= . > >> In large-scale deployments, enforcing PKs on every single table isn't > >> always practical. > >> > > > > I'm not saying users shouldn't create tables without a primary key. I'm= arguing > > that this decision should take into account what adjustments need to be= made to > > use these tables in logical replication. > > > >> > >> I think the goal of this proposal is not to change the underlying tabl= e > >> property design, but rather to seek a mechanism (like a Publication > >> option) to ensure this automation functions safely without external > >> intervention. It is simply about allowing the database to handle these > >> valid, common scenarios gracefully when automation is enabled. > >> > > > > You didn't get it. You already have one property to handle it and you a= re > > proposing to add a second property to handle it. > > > > I think you are pursuing the wrong solution. IMO we need a solution to = enforce > > that the logical replication contract is valid. If you create or modify= a table > > that is part of a publication, there is no validation that that table c= omplies > > with the publication properties (update and delete properties should re= quire an > > appropriate replica identity). We should close the gaps in both publica= tion and > > table. > > > > If I summarize Euler=E2=80=99s position in short words: discipline over c= onvenience. I actually strongly agree with that. In PG we generally prefer = explicit over implicit behavior, and predictability over magic. > > Based on the discussion so far, I think we share the following design goa= ls: > > 1) Keep replica identity as a table property. > 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in table= s without primary keys. > 3) Avoid global or implicit behavior changes. > 4) Preserve explicit opt-in for higher WAL cost. > 5) Keep the logical replication contract explicit and enforceable. > > I=E2=80=99ve been thinking about whether adding a new replica identity co= uld meet these goals. > > Today we have four replica identities: DEFAULT (PK, fallback to NONE), IN= DEX, FULL, and NONE. > > What if we introduce a new replica identity, tentatively called =E2=80=9C= FORCE=E2=80=9D: PK with fallback to FULL. (Let=E2=80=99s keep our focus on = the design, not argue the name for now.) > > With this approach: > > 1) Replica identity remains a table property. > 2) Publication membership is still evaluated at runtime, so FOR TABLES IN= SCHEMA is not special-cased. > 3) No new GUCs are required. > 4) The user must explicitly opt in by setting the replica identity. Once = FORCE is chosen, adding or dropping a primary key later does not silently b= reak UPDATE/DELETE replication. > 5) The logical replication contract remains explicit; the table declares = that it is safe for UPDATE/DELETE replication even without a PK, at the cos= t of higher WAL volume. > > This feels like a small, explicit extension of the existing RI semantics.= Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK fal= lback > to NONE), so conditional RI behavior is not new, this would just make a d= ifferent fallback explicit and user-chosen. Even though I like the idea of keeping replica identity as a table property, this raises a question about the need for a new FORCE option. With this approach, the user must explicitly opt REPLICA IDENTITY FORCE for tables without a primary key (pt 4 in your design). But if the user is already expected to take an explicit action, they could simply set REPLICA IDENTITY FULL themselves at table creation time or via ALTER TABLE. From what I understood, the original requirement was to ensure that tables without a primary key do not silently fail replication, and that this is handled automatically without manual intervention. In contrast, the FORCE approach still requires users to explicitly configure replica identity for each table, which seems to reintroduce the same operational burden. For example, instead of: ALTER TABLE t1 REPLICA IDENTITY FORCE; the user could simply do: ALTER TABLE t1 REPLICA IDENTITY FULL By the time, the user is explicitly configuring replica identity, they would already need to understand the replication semantics, so it's not clear what additional benefit FORCE provides over FULL. Am I missing something here? thanks Shveta