Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vWVYG-002RYE-0n for pgsql-hackers@arkaria.postgresql.org; Fri, 19 Dec 2025 08:09:29 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vWVYE-00739D-1n for pgsql-hackers@arkaria.postgresql.org; Fri, 19 Dec 2025 08:09:27 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vWVYE-007395-0O for pgsql-hackers@lists.postgresql.org; Fri, 19 Dec 2025 08:09:27 +0000 Received: from mail-pl1-x630.google.com ([2607:f8b0:4864:20::630]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vWVYC-001TJk-0v for pgsql-hackers@lists.postgresql.org; Fri, 19 Dec 2025 08:09:26 +0000 Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-2a0d52768ccso19229065ad.1 for ; Fri, 19 Dec 2025 00:09:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766131763; x=1766736563; darn=lists.postgresql.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=d8UIiwQA4s71ZRZVri+uAfdGqkNlL9a4sY6IXwHdOUw=; b=Nv6BG6M0XTmq1l4fyhpuXI/EBXVyXZ1XfXoVZ6ZuZSa/qQYr6S3X0wmxAd3s0N18Qg LUfNWlSAxwJYFbVdSKPoMXj/a84k8UARTuPMjmUULvYjFCQg8kwO/NbTREEDlLNlw7r5 p4mdqYNf/30YSrNryV+nL4Lqx03EX5ot7MaC2ywUzDGSkeRRg5k7I8Wv+XjcOE0MRKR9 75/ApLCoW3XULQB73Xosk9BBTI5txRnmWSNMwIyIbvtXiIC0eJ0ExTGJxaLHpKSlwZiX OoXIWLttbIAQua/M/gAP9WlFVQt4alUi2cytRIFki6MkkulalX7upX86eaI8dNA0crni KWfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766131763; x=1766736563; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=d8UIiwQA4s71ZRZVri+uAfdGqkNlL9a4sY6IXwHdOUw=; b=QwQUzy2WcrDaABfvVTBoMruJsoJdJ/bibtzhFMb0Zde4pr2ciqMJeXOgd5S43o8GoR yjT7DPHSChR/yHQTyEmwgabMWuFAAHsye8BuklprLe8QKxY8RZUzXa3qgT6LKNR0QRqW VaNYudonk2cM9OvExaaiwr6kjjLktTlI/iX2WvF5wk1LK0l4Nle+H3CK0E7ebT/n0V// ZlOrTEGMWNH9x2oD9QlVoIFYZLnVjMulEeIVELmyoeNwXe4eYHTsvKVQa6r1ilVJ+3C4 1b07V6opYleigh9WvKeC2q0EpzxPYrhctUNtgfay/JMTlPMGyjQ9Q0LTxLGy7CvqlIb1 p0KA== X-Forwarded-Encrypted: i=1; AJvYcCVnAleKeNGc1rpwEpt0gFZ82mhzcLqdS6Aa2eYOlVJqCjiUfEguuMCkqLKnGx3NiuKxE2Iy+SfYXXD6zeuA@lists.postgresql.org X-Gm-Message-State: AOJu0YzcNVLIIC2L2loqhar6m/0bRpzhYPoSAX8UVcO2CxQcnJ9BZQi8 YCZ4hOQ1YoXtuIxSbrz6bMgG2DLtGXEjeLMjfqJpBD08elX59h4SVt0C X-Gm-Gg: AY/fxX5jZX1iO0U3M33UUQbiODTHwVQ/AiDfqiokRLcobbSuj+kP9H/Oa4ZoncfOHnN LSa0y+o0+eoyvLR6datmrDLYkyW3B2NyC4WJZ+193lJvsp/k87CDyfEzzZpJMKZsG78Tg+UT2qN y7AePuWtiRtwObB+XYTeRJejKr5KKEJZWUW2YpeRYiIHN/Ja+b0ONT4RynoC7XGMlR4HEexf1yt /UuvS9VoPlcSrdX+C70nJcWbJzhIgL76aI3aY9Z9+uiianf+SJVhRQSkSNyVBrfzq49bhuy1fa3 LYB0TG6XlXKuvBdDm/NheoemCF5CutlxvlZooPg+5oEyXUVnvaLqx0WnyYiXDJj9zshlNs5rOhj 2mvlkNcg6JpEyNyzXxBwFxAV5er+s66TLY3YaytYpzeR+MZew22S1Sb5MyMT4Ji6MTyjwihWB/P Nv/S9JcQx00mJgvglOu94kmrRmx8Dsqdo= X-Google-Smtp-Source: AGHT+IFF69VzmKojKD9TI3MNVszLYMNreBbXbTs3BMIGEMK8WmuamvZI1b1KMEB/M+jtQ+7Kz+GXmA== X-Received: by 2002:a05:7022:503:b0:11b:9386:8260 with SMTP id a92af1059eb24-1217231390fmr1919846c88.45.1766131762848; Fri, 19 Dec 2025 00:09:22 -0800 (PST) Received: from smtpclient.apple ([142.171.105.12]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1217253c058sm5626220c88.11.2025.12.19.00.09.20 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 19 Dec 2025 00:09:22 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81\)) Subject: Re: Improve logical replication usability when tables lack primary keys From: Chao Li In-Reply-To: <19dac243-1f46-4720-bdec-bf0f851d03b9@app.fastmail.com> Date: Fri, 19 Dec 2025 16:08:39 +0800 Cc: GRANT ZHOU , "houzj.fnst@fujitsu.com" , Amit Kapila , Dilip Kumar , Postgres hackers Content-Transfer-Encoding: quoted-printable Message-Id: <875BBCC0-CF08-4136-8E9E-F03DF75C3A11@gmail.com> References: <5ABD7727-CD22-4112-A186-0E788EE78109@gmail.com> <23A24BFF-18A7-4FE9-AAFA-13E1AA207DD0@gmail.com> <19dac243-1f46-4720-bdec-bf0f851d03b9@app.fastmail.com> To: Euler Taveira X-Mailer: Apple Mail (2.3826.700.81) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > On Dec 18, 2025, at 22:49, Euler Taveira wrote: >=20 > On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote: >> On Wed, Dec 17, 2025 at 12:50=E2=80=AFPM Euler Taveira = wrote: >>> Each table needs to say what's its row identifier. The user created = a table >>> without primary key. Well, create a primary key. There are dozens of = thousands >>> of objects. Use a script. >> However, I=E2=80=99d like to share a user perspective regarding the = "use a=20 >> script" approach. The main value of `FOR TABLES IN SCHEMA` is=20 >> *in-database automation*. If users still need to maintain external=20 >> scripts to monitor and `ALTER` new tables to prevent replication=20 >> errors, it significantly diminishes the value of that automation. >>=20 >=20 > As I tried to explain in the previous email, the problem with FOR ALL = TABLES > and FOR TABLES IN SCHEMA syntax is that the is no catalog information = about the > relations; the list of relations is collected at runtime. >=20 > When I suggested "use a script" I was referring to fix the logical = replication > setup regarding the lack of primary key. There is no need to have an = automation > outside the database, use an event trigger. If your lazy user doesn't = create > the primary key, assign REPLICA IDENTITY FULL. Something like >=20 > -- This example is far from being a complete solution for fixing the = lack of > -- primary key in a logical replication scenario. > -- ALTER TABLE should be supported too > CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity() > RETURNS event_trigger LANGUAGE plpgsql AS $$ > DECLARE > obj record; > rec record; > ricnt integer :=3D 0; > BEGIN > FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() > LOOP > IF obj.command_tag =3D 'CREATE TABLE' THEN > SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid =3D = obj.objid AND indisprimary; > RAISE NOTICE 'ricnt: %', ricnt; > IF ricnt =3D 0 THEN > EXECUTE 'ALTER TABLE ' || obj.object_identity || ' = REPLICA IDENTITY FULL'; > END IF; > END IF; > END LOOP; > END; > $$; >=20 > CREATE EVENT TRIGGER event_trigger_for_replica_identity > ON ddl_command_end > EXECUTE FUNCTION event_trigger_for_replica_identity(); >=20 > CREATE TABLE event_trigger_test_1 (a int); > \d+ event_trigger_test_1 > CREATE TABLE event_trigger_test_2 (a int primary key); > \d+ event_trigger_test_2 > CREATE TABLE event_trigger_test_3 (a int, b text not null, primary = key(b)); > \d+ event_trigger_test_3 > --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT = event_trigger_test_3_pkey; > --\d+ event_trigger_test_3 >=20 > DROP EVENT TRIGGER event_trigger_for_replica_identity; > DROP FUNCTION event_trigger_for_replica_identity; > DROP TABLE event_trigger_test_1, event_trigger_test_2, = event_trigger_test_3; >=20 > = 8<------------------------------------------------------------------------= ----8< >=20 >> Additionally, tables without Primary Keys are valid SQL and extremely=20= >> common in enterprise environments (e.g., audit logs, data = warehousing).=20 >> In large-scale deployments, enforcing PKs on every single table isn't=20= >> always practical.=20 >>=20 >=20 > I'm not saying users shouldn't create tables without a primary key. = I'm arguing > that this decision should take into account what adjustments need to = be made to > use these tables in logical replication. >=20 >>=20 >> I think the goal of this proposal is not to change the underlying = table=20 >> property design, but rather to seek a mechanism (like a Publication=20= >> option) to ensure this automation functions safely without external=20= >> intervention. It is simply about allowing the database to handle = these=20 >> valid, common scenarios gracefully when automation is enabled. >>=20 >=20 > You didn't get it. You already have one property to handle it and you = are > proposing to add a second property to handle it. >=20 > I think you are pursuing the wrong solution. IMO we need a solution to = enforce > that the logical replication contract is valid. If you create or = modify a table > that is part of a publication, there is no validation that that table = complies > with the publication properties (update and delete properties should = require an > appropriate replica identity). We should close the gaps in both = publication and > table. >=20 If I summarize Euler=E2=80=99s position in short words: discipline over = convenience. I actually strongly agree with that. In PG we generally = prefer explicit over implicit behavior, and predictability over magic. Based on the discussion so far, I think we share the following design = goals: 1) Keep replica identity as a table property. 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in = tables without primary keys. 3) Avoid global or implicit behavior changes. 4) Preserve explicit opt-in for higher WAL cost. 5) Keep the logical replication contract explicit and enforceable. I=E2=80=99ve been thinking about whether adding a new replica identity = could meet these goals. Today we have four replica identities: DEFAULT (PK, fallback to NONE), = INDEX, FULL, and NONE. What if we introduce a new replica identity, tentatively called = =E2=80=9CFORCE=E2=80=9D: PK with fallback to FULL. (Let=E2=80=99s keep = our focus on the design, not argue the name for now.) With this approach: 1) Replica identity remains a table property. 2) Publication membership is still evaluated at runtime, so FOR TABLES = IN SCHEMA is not special-cased. 3) No new GUCs are required. 4) The user must explicitly opt in by setting the replica identity. Once = FORCE is chosen, adding or dropping a primary key later does not = silently break UPDATE/DELETE replication. 5) The logical replication contract remains explicit; the table declares = that it is safe for UPDATE/DELETE replication even without a PK, at the = cost of higher WAL volume. This feels like a small, explicit extension of the existing RI = semantics. Notably, REPLICA IDENTITY DEFAULT already has conditional = behavior (PK fallback to NONE), so conditional RI behavior is not new, this would just make a = different fallback explicit and user-chosen. After that, we could consider a database-level default_replica_identity = setting, applied at table creation time, for environments that want this = behavior consistently. But that would only make sense if we first agree = on the table-level mechanism. I=E2=80=99m interested in whether this direction aligns better with the = goals above. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/