Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXZWS-003L8H-1d for pgsql-hackers@arkaria.postgresql.org; Thu, 11 Jun 2026 07:08:16 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wXZWQ-00Em0X-0H for pgsql-hackers@arkaria.postgresql.org; Thu, 11 Jun 2026 07:08:14 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXZWP-00Em0P-2R for pgsql-hackers@lists.postgresql.org; Thu, 11 Jun 2026 07:08:13 +0000 Received: from mail-pj1-x1032.google.com ([2607:f8b0:4864:20::1032]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wXZWN-00000002TwU-2Ixz for pgsql-hackers@lists.postgresql.org; Thu, 11 Jun 2026 07:08:13 +0000 Received: by mail-pj1-x1032.google.com with SMTP id 98e67ed59e1d1-36dd65b95f2so383105a91.0 for ; Thu, 11 Jun 2026 00:08:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1781161688; cv=none; d=google.com; s=arc-20240605; b=VpGveAt4dD5pfyidJWlcYzPEh0uYm2XHRf0C/if9nA92G1lbQp0il7cJSmWezSzZEa blJFdZMX6GS7MYDGD8qB277ZRk6Is9aSMH5a7oiv0LU4Oh+GfA10OfCmBO20HQa7fFIL eBmNMcW/r2jbTmd/Wt/OEjTZkZW5rN2V6LzN9pdnhiuDzWcVbXDr1fkkxGPLboev36gT KZKzrPr0E8mqW8ZOhzUMES4EO449UcXvsZAYfEJx5BZoQTxrC6t4GtHaoP89hrt1pOSa w5uLaHEi42CQlEcYBtOy4qAs8DBQeKY0WyzxUsoYNh/6MOzR8wIwRF/IqorBvK4KyD0E 22pw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=cjfhvNssoOAYDQ5/TfXfIvW2C4yHWEf/9njXUrD9BPs=; fh=dNd8Ck8cu/Nct2pr8lNy60Oiv7m/9cLOJ8xF+imNoTs=; b=k7R749fSI49xv87M2j/bJPJCYVRnIBBj8OwU9jHlRzNvKEdj7YcAs10naN8KxHzaBM /0AEx3KBLEr9ZfNaC9uXQWtLLwBMF97Sz0LWW6US0TIgti8WPhuGv5D6tVtQcyDlg2i8 mFEDgOlBnYUsdXLDtcOc1E++aU7MUJEwz5y7/18jtTZDXqxXa2LVcr56fYB/MONktISp Wf6Ae8IH12m2nJEzLzdlRUOocMmoGr/0hGzkV4Zw8ONGuIPqWVvFXOBZvyPKPmxvs+en jjtqQ8BWnpg7k5cy/JeoDzr5AMh3iSUmo1LJFfWGebylBhfvGusoB6Y1RQY4CwDdmRVC 6nkQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1781161688; x=1781766488; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=cjfhvNssoOAYDQ5/TfXfIvW2C4yHWEf/9njXUrD9BPs=; b=CUV1S02er1yhusJoQUNql2GDhcuwzozRwOfVan+D8InZbpNapKax5pZacsunNcyse2 +QdrkVaCgHJySwVmdtoX1/4WjPPU6qTTRZ/PAwv5Uk7kp/n9U7nOdwLPYAzNwrrQXcXR FOBx4Rh828eSjEb9D6Bq4UNBuzphKug9ScNztl2EzilqOKhIC4NjkYJLd0hHAkjBmgQL t+LlayHYUmtKsRM0gIvnaSaSO9Xof9kqP1V6pbG933+lWkPIlF2mvEY9m9DOXfo/oNW0 xHVKH/iIu2zaRjSuWgL6cgzcsruWjWn8EhO2uE8dV+YSntZNIcHZhgcNYshM5BIe/PBz kn0w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1781161688; x=1781766488; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=cjfhvNssoOAYDQ5/TfXfIvW2C4yHWEf/9njXUrD9BPs=; b=oO7L7v20FarGZpWekpMWDIMMxwI/sI/2Sf3YuzZGtT/NRfY1PXF02pmWme1KuCmSn/ uMrmt1qnvIUSC8uWM6TflL58A4mNsUPP/mTCEAcUweywjOtpOqU0NkOkCwYSTpRlj0ZZ uT0br7gIZlRTL9/Db2yTlJP3IzSlojc+TeVEYrdKSzyPk3vpZU6O6ls5Vefy/yZWcGdq ehtbE8OosmNGsiGqk06RbIUyHhzoGAaqonFsTBIMVpU3gDfsSmO8DTvVCxuOijNZLp/w XP8AZEbULgvArKDyv8hrH1NYjtkTKD3v+cLQsbJ11cNaU2TmtY+c8FGGRqB+vbeGgYxN Uknw== X-Forwarded-Encrypted: i=1; AFNElJ9pf2qPSCmdqWyBCBdX8MLHf6beBPRPcYHn5D/VpnnxpGCabHTt+KEpJj4sxO9/lrHK25eWanwXZtlVBMdz@lists.postgresql.org X-Gm-Message-State: AOJu0Yyz6HCSN/gF1avaR1j9U4FALcAKiFnZL0hUG8JsBAxqZd4KBkTK LHonUitWUKa6/wdr1MELsbrnZXTh1CHM4JfIhAx+8hh/b848oLxcQVDSB13zGfCfT9+bb4xEsM8 Pyg7fQygeWyJkziDoSbzW2ufgxNoP8Kc= X-Gm-Gg: Acq92OFbsXcJprkafY9BQ7L5ni5mSlPlpO5ytQ26mMCN2kpJi7/kf0kU8FyLW74I0E0 zJAVDk7WhuKcj3H7x819tU00PTD9C3hKb0Vd16ApgDw4SYTNmGsySU4cq1HR+MRJWSLrDVFS4md V10yV4CUlhZhpnKZ15Ru3Hs5ZTatdaaLHCwHO2cP5N++t8YnzHZLnDIVe9W6SBcvM/vkrbRLGe7 Ea4tjycHovBI5BohC1YBvBl4sBbpz4TaXcRw5rMqSyIEgdEWwozXEq7mmSd42cL0CzfBSysvImG Pb47SFfUi1CBA/lHVTEoKbQpQ+Zez6yePgmJMyk0kzsyuweKePA= X-Received: by 2002:a17:90b:3850:b0:36a:8519:a4e9 with SMTP id 98e67ed59e1d1-377f5124df4mr1356749a91.18.1781161683214; Thu, 11 Jun 2026 00:08:03 -0700 (PDT) MIME-Version: 1.0 References: <080f9394-c127-4cef-865e-10f2f997125c@postgrespro.ru> <38690b0e-f91b-46fa-b72a-57775612e463@postgrespro.ru> In-Reply-To: From: Masahiko Sawada Date: Thu, 11 Jun 2026 00:07:26 -0700 X-Gm-Features: AVVi8CfXPpPIwLem84D_a5EVkFzMATFHOAr0egfPY64Khp-FsDvqv1-2iXWUejk Message-ID: Subject: Re: Support logical replication of DDLs, take2 To: Amit Kapila Cc: Vitaly Davydov , Ashutosh Bapat , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, May 5, 2026 at 10:36=E2=80=AFAM Masahiko Sawada wrote: > > On Mon, May 4, 2026 at 5:23=E2=80=AFAM Amit Kapila wrote: > > > > On Fri, May 1, 2026 at 2:11=E2=80=AFAM Masahiko Sawada wrote: > > > > > > On Wed, Apr 29, 2026 at 9:44=E2=80=AFPM Amit Kapila wrote: > > > > > > > > On Wed, Apr 29, 2026 at 3:19=E2=80=AFAM Masahiko Sawada wrote: > > > > > > > > > > On Mon, Apr 27, 2026 at 11:32=E2=80=AFPM Amit Kapila wrote: > > > > > > > > > > > > Yes, there will be a maintenance cost of JSON-based deparsing > > > > > > approach. But note that multiple senior people (Alvaro Herrera,= Robert > > > > > > Haas) [1] seems to favor that approach. So, I am not sure we ca= n > > > > > > conclude to abandon that approach without those people or some = other > > > > > > senior people agreeing to abandon it. To be clear, I am not aga= inst > > > > > > considering a new/different approach for DDL replication but ju= st that > > > > > > it is not clear that old/existing approach can be ruled out wit= hout > > > > > > more discussion on it, > > > > > > > > > > Thank you for pointing it out. Just to be clear, IIUC what they l= iked > > > > > was to use JSON string representation of DDLs, but not JSON strin= g > > > > > representation of DDLs that are deparsed from parse nodes, no? > > > > > > > > > > > > > As per my understanding, we built deparsing stuff with a goal of > > > > supporting DDL replication and Alvaro was the original author of th= at > > > > work, see [1]. The benefit it provides flexibility in terms of > > > > filtering by decoding plugin, if any, or changing the DDL (like > > > > schema-mapping) during apply. It is not clear to me if we can achiv= e > > > > similar level of flexibility with other approach. > > > > > > I think we can generate the same JSON-string representation of DDLs > > > from catalog information, it would also require a lot of code, though= . > > > It would be independent from parse nodes and if we implement it as an > > > option for pg_get_xxx_ddl() functionality it would be able to be > > > reused by other tools too. > > > > > > > IIRC, this was discussed previously as well but we were not sure if we > > can build all (especially some complex ones) without parsetree. See > > discussion/emails around [1][2]. > > Right. We would need parsetree somewhat. I think we're able to > generate CREATE and DROP TABLE statements for the particular table > without parsetree. But as for generating CREATE TABLE for the table, > it's going to be a combination of CREATE/ALTER TABLE/INDEX/SEQUENCE > statements, like pg_dump does. For instance, if a user executes > "CREATE TABLE foo (id serial primary key)", we create table, sequence, > and index, but searching system catalogs doesn't tell us these objects > are created in one statement. So we would generate multiple DDLs as > follow: > > CREATE TABLE public.foo (id integer NOT NULL); > CREATE SEQUENCE public.foo_id_seq AS integer ...; > ALTER SEQUENCE public.foo OWNED BY foo; > ALTER TABLE public.foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id); > > While these queries create the same table as the one on the publisher, > we need to consider whether it's okay to replicate these queries > instead of the oen statement originally executed on the publisher. If > we can use something like pg_get_table_ddl() in DDL replication, that > function would be able to be used also by the initial schema > synchronization. > > As for ALTER TABLE, we would need parsetree of ALTER TABLE subcommands. > > > > > > > > > > > > > > > > > > > > > > > > We would need to maintain the JSON serialization code whenever > > > > > > > creating or modifying parse nodes, regardless of whether the = changes > > > > > > > were related to DDL replication. IIUC, this was the primary r= eason the > > > > > > > feature didn't cross the finish line. > > > > > > > > > > > > > > Additionally, I think there is another design issue: it is no= t > > > > > > > output-plugin agnostic. Since the deparsed DDL was written by= a > > > > > > > logical-replication-specific event trigger, third-party logic= al > > > > > > > decoding plugins cannot easily detect DDL events. > > > > > > > > > > > > > > > > > > > Why RmgrId like RM_LOGICALDDLMSG_ID and XLOG_LOGICAL_DDL_MESSAG= E wal > > > > > > info is not sufficient for this? Decoder will add a message lik= e > > > > > > REORDER_BUFFER_CHANGE_DDL which can be used to detect DDL messa= ge, no? > > > > > > > > > > Right, but I'm not sure this is a good developer experience that > > > > > additional steps are required to capture DDL events for other plu= gins > > > > > while changes of INSERT/UPDATE/DELETE/TRUNCATE are passed from t= he > > > > > logical decoding by default. > > > > > > > > > > > > > Yes, there could probably be additional steps for plugins but they > > > > must be doing a few things already which are defined at publication > > > > level like column lists, row filtering, something related to RI, et= c. > > > > > > I think those publication-level features operate at a somewhat > > > different layer than the fundamental mechanism of capturing DDL > > > events. Plugins filter rows or columns based on configuration, but th= e > > > logical decoding itself guarantees that the DML events are reliably > > > passed to them. Given that the TRUNCATE in logical replication alread= y > > > works so, I guess DDL should have the same fundamental guarantee. > > > > > > it's unclear to me how plugins could reliably manage these event > > > triggers. While a plugin might create an event trigger during the > > > startup callback if it doesn't exist, it cannot drop it during the > > > shutdown callback. We also cannot establish a dependency between an > > > event trigger and a logical replication slot. We would likely need to > > > invent a new plugin callback specifically invoked at slot drop time > > > just to clean it up. Also, if different plugins want to capture DDL > > > events, they could end up registering different event triggers, > > > emitting multiple DDL WAL records for the same DDL event. > > > > > > > Why would different plugins end up registering different event > > triggers? I mean if they are already registered by the first plugin > > what is the need to re-register. > > > > Since you mentioned column lists and row filtering as examples of what > individual plugins already do in a reply to my point that registering > event triggers could be an additional step for other plugins, I > thought you meant that each plugin registering event triggers is not a > huge cumbersome. I think different plugins don't need to register > different event triggers. We can have the common event triggers to > write logical-DDL WAL and register them when the first logical slot is > created. But as I mentioned, we need to be careful about both the > concurrent slot creation/drop and the fact that slot creation/drop > operations are not transactional. Also, we cannot create event > triggers on the replicas even if a logical slot is created there. If a > failover happens before applying the WAL of creating event triggers, > we would need to somehow make sure that even triggers are created on > the new primary if it has logical slots. > I've reviewed and researched the last proposed DDL deparse patch[1], and found that it can generate a set of commands for a single command. For example, deparsing "create table test_serial (a serial)" generates: - CREATE SEQUENCE public.test_serial_a_seq CACHE 1 NO CYCLE INCREMENT BY 1 MINVALUE 1 MAXVALUE 2147483647 START WITH 1 RESTART 1 AS pg_catalog.int4; - CREATE TABLE public.test_serial (a pg_catalog.int4 STORAGE PLAIN NOT NULL DEFAULT pg_catalog.nextval('public.test_serial_a_seq'::pg_catalog.regc= lass)); - ALTER SEQUENCE public.test_serial_a_seq OWNED BY public.test_serial.a; And deparsing "create table test_b (a int, b int references test_a (a))" generates: - CREATE TABLE public.test_b (a pg_catalog.int4 STORAGE PLAIN, b pg_catalog.int4 STORAGE PLAIN); - ALTER TABLE public.test_b ADD CONSTRAINT test_b_b_fkey FOREIGN KEY (b) REFERENCES public.test_a(a); The reason is that the patch deparses after the command runs (via an event trigger at ddl_command_end) and rebuilds the DDL from the catalog. By that time, the server has already expanded one CREATE TABLE into several internal sub-commands: transformCreateStmt() turns a serial column into a CREATE SEQUENCE plus ALTER SEQUENCE ... OWNED BY, a foreign key into an ALTER TABLE ADD CONSTRAINT, a primary key into an index, and so on. So the output looks much like pg_dump or the proposed pg_get_table_ddl() [2], because it also rebuilds from the catalog, not from the original parse tree. For a general DDL deparse feature, is it useful to turn one command into several, or to emit a command that is different from what the user ran? Since DDL deparsing can be used also for schema-qualifying DDL commands, I considered it's useful for audit purposes or CDC use cases. But if the generated DDL command could be different or even multiple DDL commands are generated from one DDL command, I'm not sure it's useful for other use cases than DDL replication. Even in DDL replication use cases, I'm concerned that it might be confusing users. For example, what if DDL replication has DDL command filter and users specify it to replicate only 'CREATE TABLE' and not for 'ALTER TABLE'? Users might expect all replicated DDL commands are 'CREATE TABLE' but it would not be able to replicate some form of CREATE TABLE without ALTER TABLE or CREATE/ALTER SEQUENCE. Also, given that the proposed pg_get_table_ddl()[2] has approximately 1800 lines to support generating CREATE TABLE commands for the given table, does it really make sense to have additional 3000 lines to support DDL deparsing that works in mostly the same way but based on parse trees? While the JSON blob idea is flexible and preferable, considering that two features seem to work in mostly the same way, I'm not sure it can justify the implementation costs. I think that DDL deparse should keep the intention and the form of the original command as much as possible: do not turn one command into several commands, and do not add options that the user did not write. I see two benefits: First, as a feature on its own, this makes DDL deparse usable for pure schema-qualification. That is useful outside DDL replication too, for example for audit or CDC. (Some audit cases may instead want the fully expanded form. We could offer that as an option, but I think the default should stay faithful to the original command.) Second, as a building block for DDL replication, keeping the original command lets the subscriber benefit from improvements in its own version. Some commands may take a lighter lock, or use better new defaults. For example, in v19 the default TOAST compression became lz4, while it was pglz before. If we had DDL replication in v18 and a user set up a logical replication v18 -> v19, I think they would not want new tables on v19 to be forced to pglz just because that was the publisher's default. If a user does want exactly the same options on the subscriber, we can add an option to include the options that were not explicitly specified too. But even then I do not think we should produce multiple commands from one command. Another point I think we should discuss is how to capture DDL events. The last patch creates event triggers automatically at CREATE PUBLICATION. I am not sure this is the best approach: Third-party logical decoding output plugins do not use PUBLICATION, so they cannot easily capture DDL events this way. We could instead create a common event trigger when the first logical slot is created. But it is not easy to make sure the trigger is reliably created and dropped, including on replicas (for example, a slot can exist only on a replica that is later promoted). I think we can decouple the DDL capture infrastructure from event triggers, and use it as common infrastructure for both event triggers and WAL-logging of DDL for replication. That way, I think we can trigger DDL deparse while not relying on event triggers and minimizing the code duplication. I'm drafting the patch for this idea and feedback is very welcome. Regards, [1] https://www.postgresql.org/message-id/OS0PR01MB57163E6487EFF7378CB8E17C= 9438A%40OS0PR01MB5716.jpnprd01.prod.outlook.com [2] https://www.postgresql.org/message-id/CANxoLDfjQnhM%3DE6JSyYo9s9OdjqoN8= s_3wE5yL%3DkaDu_X8j-dA%40mail.gmail.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com