Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vXePJ-000Ry7-18 for pgsql-hackers@arkaria.postgresql.org; Mon, 22 Dec 2025 11:48:58 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vXePI-00Ef9f-0J for pgsql-hackers@arkaria.postgresql.org; Mon, 22 Dec 2025 11:48:56 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vXePH-00Ef9W-2W for pgsql-hackers@lists.postgresql.org; Mon, 22 Dec 2025 11:48:56 +0000 Received: from mail-lj1-x22e.google.com ([2a00:1450:4864:20::22e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vXePG-00242R-18 for pgsql-hackers@lists.postgresql.org; Mon, 22 Dec 2025 11:48:56 +0000 Received: by mail-lj1-x22e.google.com with SMTP id 38308e7fff4ca-37bb8bef4cdso38149411fa.3 for ; Mon, 22 Dec 2025 03:48:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766404134; x=1767008934; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=581dcUzEkBWpzIcWUd/QexxvnGM1uVzalEy1+zMlNBM=; b=BQvU1f2W3MGcBV/lI3zlw8gvDC+6XGGL3GOijHXiCR78me/2iguvj0OLCc7Mgso9LX jRA4a/7K5D2YcYZtJwzzPfDCcJqdKqy6O6WNnDUkH1f+85uTHvqtixPdNVwOo9ui4KBL dY6SHo5KkJOZWFEPjz9204+SeKe8mMWR4O/d7YAawSktzGGoSx+XSaGvJFlVSibTP7mg gwiuQF3YDjnWC6ueNf7lBpJuYYjy7/R+vO5oBGVvQLk3Se9tsqDjtCQmFP+TU6oOXw7o 5/b2HEafLzbJhQ37U0LHEFEzC/ehSMdus3pHswf24/kCxkr0X8/I59hd5h94oW2hguy1 fV2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766404134; x=1767008934; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=581dcUzEkBWpzIcWUd/QexxvnGM1uVzalEy1+zMlNBM=; b=Uq7zP9qcHF/ajAGrGII15rf0xBKIuUZyt4IncRbYSgQ6UPpAlFHnfCFoZKtqP3f+JT ppytRbKxmP/nmbq0x3biBDBZZeyVIRjEuzSa/UUdFiPTC0wAK/ButKEWEuwj2/sWwWWC 1BuRb3ku/YSV/M9i7XcRmAQcpN3q4w2vjORWjChnCK7JHjZnvusNzKaiPqX6BlPFYPcw cvxsWd26QZpbCVkQR9ZfF9rhRf66hsGNFByQsuAFG+H6/4KJRsfAo4vY2x+yERbHTR3R H5sSqlE9eW3hSFLL16QjpSp6uLI49NmWrdcFYY0GVQs8CzTHjaFYJpDmMj9OMKDkkvE3 8j7w== X-Forwarded-Encrypted: i=1; AJvYcCWzeABsQ+UqY8hZn2gnUO/CZuE2FmbeKwP3iMbCSnjHWB8vAaS8nDwKMNyQSWfzXhL9q6UBXt/MMAOE5ggN@lists.postgresql.org X-Gm-Message-State: AOJu0Ywj3vEGir1oCfJi1BfynRM4VdiM6x9LeFnUshhGpIBjEc3TfUwI ciuaImIf41jyOL8faxlvGiNDCKKX70INOILm5coDRtx70Kw54WpkQZTAOz0sNq2SdN1k7C2ZzxJ TxlOxJj5xtoKPqWpBTxRZ1kHDDQ+wmQ8= X-Gm-Gg: AY/fxX4c1DazBq8iFrUEMrMl5feW1vk0PyabzDDFbLyVaRoYAvV8xwjc5T6MFgAvVqI hGH+NdQi5XF0IhZ5I/ZzJ+olIcasmC5K01AN7mHxfQY59PkIi2Sx8NhtSibrQeOH9Nx6NigukAy SF5I814ps1zRSVre/XxuMbrhR8e/BYqEQdib7iZPTWJj5eVTokQEVlzXZPMP6YIFmRvVyfkAp4B 8YA7AMmd3WSms/3X6OqmvjXEBAydskcWHYGkjUrivuToL8fX6obYOWu8ibZmNUCu0+Of3NRQuPm gWyLAU8R2t7/pnlPrtyPQJRrWlFnYA== X-Google-Smtp-Source: AGHT+IGxUxReg6v94ywEDi6G4FPPj4Zbb2lq64sEM3YynpvGY92xBqEhN4DHrOYhQurQoms3uoQ/por6LXQUkzgBC6g= X-Received: by 2002:a2e:a7cf:0:b0:37a:45b0:467a with SMTP id 38308e7fff4ca-3812158ffe9mr36647681fa.5.1766404133251; Mon, 22 Dec 2025 03:48:53 -0800 (PST) MIME-Version: 1.0 References: <5ABD7727-CD22-4112-A186-0E788EE78109@gmail.com> <23A24BFF-18A7-4FE9-AAFA-13E1AA207DD0@gmail.com> <19dac243-1f46-4720-bdec-bf0f851d03b9@app.fastmail.com> <875BBCC0-CF08-4136-8E9E-F03DF75C3A11@gmail.com> In-Reply-To: <875BBCC0-CF08-4136-8E9E-F03DF75C3A11@gmail.com> From: Amit Kapila Date: Mon, 22 Dec 2025 17:18:41 +0530 X-Gm-Features: AQt7F2oQx3TmZhBTV2wNW-_rx57WyPk7vgDwRzK4Srp693_iLZG983QrXXv0A8I Message-ID: Subject: Re: Improve logical replication usability when tables lack primary keys To: Chao Li Cc: Euler Taveira , GRANT ZHOU , "houzj.fnst@fujitsu.com" , Dilip Kumar , Postgres hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Dec 19, 2025 at 1:39=E2=80=AFPM Chao Li wr= ote: > > > On Dec 18, 2025, at 22:49, Euler Taveira wrote: > > > > On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote: > >> On Wed, Dec 17, 2025 at 12:50=E2=80=AFPM Euler Taveira wrote: > >>> Each table needs to say what's its row identifier. The user created a= table > >>> without primary key. Well, create a primary key. There are dozens of = thousands > >>> of objects. Use a script. > >> However, I=E2=80=99d like to share a user perspective regarding the "u= se a > >> script" approach. The main value of `FOR TABLES IN SCHEMA` is > >> *in-database automation*. If users still need to maintain external > >> scripts to monitor and `ALTER` new tables to prevent replication > >> errors, it significantly diminishes the value of that automation. > >> > > > > As I tried to explain in the previous email, the problem with FOR ALL T= ABLES > > and FOR TABLES IN SCHEMA syntax is that the is no catalog information a= bout the > > relations; the list of relations is collected at runtime. > > > > When I suggested "use a script" I was referring to fix the logical repl= ication > > setup regarding the lack of primary key. There is no need to have an au= tomation > > outside the database, use an event trigger. If your lazy user doesn't c= reate > > the primary key, assign REPLICA IDENTITY FULL. Something like > > > > -- This example is far from being a complete solution for fixing the la= ck of > > -- primary key in a logical replication scenario. > > -- ALTER TABLE should be supported too > > CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity() > > RETURNS event_trigger LANGUAGE plpgsql AS $$ > > DECLARE > > obj record; > > rec record; > > ricnt integer :=3D 0; > > BEGIN > > FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() > > LOOP > > IF obj.command_tag =3D 'CREATE TABLE' THEN > > SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid =3D = obj.objid AND indisprimary; > > RAISE NOTICE 'ricnt: %', ricnt; > > IF ricnt =3D 0 THEN > > EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPLI= CA IDENTITY FULL'; > > END IF; > > END IF; > > END LOOP; > > END; > > $$; > > > > CREATE EVENT TRIGGER event_trigger_for_replica_identity > > ON ddl_command_end > > EXECUTE FUNCTION event_trigger_for_replica_identity(); > > > > CREATE TABLE event_trigger_test_1 (a int); > > \d+ event_trigger_test_1 > > CREATE TABLE event_trigger_test_2 (a int primary key); > > \d+ event_trigger_test_2 > > CREATE TABLE event_trigger_test_3 (a int, b text not null, primary key(= b)); > > \d+ event_trigger_test_3 > > --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT event_trigger_test_3= _pkey; > > --\d+ event_trigger_test_3 > > > > DROP EVENT TRIGGER event_trigger_for_replica_identity; > > DROP FUNCTION event_trigger_for_replica_identity; > > DROP TABLE event_trigger_test_1, event_trigger_test_2, event_trigger_te= st_3; > > > > 8<---------------------------------------------------------------------= -------8< > > > >> Additionally, tables without Primary Keys are valid SQL and extremely > >> common in enterprise environments (e.g., audit logs, data warehousing)= . > >> In large-scale deployments, enforcing PKs on every single table isn't > >> always practical. > >> > > > > I'm not saying users shouldn't create tables without a primary key. I'm= arguing > > that this decision should take into account what adjustments need to be= made to > > use these tables in logical replication. > > > >> > >> I think the goal of this proposal is not to change the underlying tabl= e > >> property design, but rather to seek a mechanism (like a Publication > >> option) to ensure this automation functions safely without external > >> intervention. It is simply about allowing the database to handle these > >> valid, common scenarios gracefully when automation is enabled. > >> > > > > You didn't get it. You already have one property to handle it and you a= re > > proposing to add a second property to handle it. > > > > I think you are pursuing the wrong solution. IMO we need a solution to = enforce > > that the logical replication contract is valid. If you create or modify= a table > > that is part of a publication, there is no validation that that table c= omplies > > with the publication properties (update and delete properties should re= quire an > > appropriate replica identity). We should close the gaps in both publica= tion and > > table. > > If we want, we can ensure that any table added to that specific publication (that has an option replica_identy=3D'full') would automatically override the default to FULL, if PK is not available. This information can be cached to avoid overhead. > > If I summarize Euler=E2=80=99s position in short words: discipline over c= onvenience. I actually strongly agree with that. In PG we generally prefer = explicit over implicit behavior, and predictability over magic. > You haven't told why we can't consider a custom event trigger as suggested by Euler for customers who are not willing to change the RI default explicitly for each table. I think it is worth considering providing a custom solution outside core-postgres for your customers for this specific case. > Based on the discussion so far, I think we share the following design goa= ls: > > 1) Keep replica identity as a table property. > 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in table= s without primary keys. > 3) Avoid global or implicit behavior changes. > 4) Preserve explicit opt-in for higher WAL cost. > 5) Keep the logical replication contract explicit and enforceable. > > I=E2=80=99ve been thinking about whether adding a new replica identity co= uld meet these goals. > > Today we have four replica identities: DEFAULT (PK, fallback to NONE), IN= DEX, FULL, and NONE. > > What if we introduce a new replica identity, tentatively called =E2=80=9C= FORCE=E2=80=9D: PK with fallback to FULL. (Let=E2=80=99s keep our focus on = the design, not argue the name for now.) > > With this approach: > > 1) Replica identity remains a table property. > 2) Publication membership is still evaluated at runtime, so FOR TABLES IN= SCHEMA is not special-cased. > 3) No new GUCs are required. > 4) The user must explicitly opt in by setting the replica identity. Once = FORCE is chosen, adding or dropping a primary key later does not silently b= reak UPDATE/DELETE replication. > > 5) The logical replication contract remains explicit; the table declares = that it is safe for UPDATE/DELETE replication even without a PK, at the cos= t of higher WAL volume. > > This feels like a small, explicit extension of the existing RI semantics.= Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK fal= lback > to NONE), so conditional RI behavior is not new, this would just make a d= ifferent fallback explicit and user-chosen. > > After that, we could consider a database-level default_replica_identity s= etting, applied at table creation time, for environments that want this beh= avior consistently. But that would only make sense if we first agree on the= table-level mechanism. > I don't much like the database-level option as it expects a new default to be introduced. I think the internal working will almost be same as the option at publication-level. --=20 With Regards, Amit Kapila.