Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wbaqw-002gwM-37 for pgsql-hackers@arkaria.postgresql.org; Mon, 22 Jun 2026 09:22:03 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wbaqv-005nEU-1N for pgsql-hackers@arkaria.postgresql.org; Mon, 22 Jun 2026 09:22:01 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wbaqv-005nEM-0B for pgsql-hackers@lists.postgresql.org; Mon, 22 Jun 2026 09:22:01 +0000 Received: from mail-pg1-x536.google.com ([2607:f8b0:4864:20::536]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wbaqt-00000001jU3-04ue for pgsql-hackers@lists.postgresql.org; Mon, 22 Jun 2026 09:22:00 +0000 Received: by mail-pg1-x536.google.com with SMTP id 41be03b00d2f7-c858014845aso1670989a12.1 for ; Mon, 22 Jun 2026 02:21:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1782120116; cv=none; d=google.com; s=arc-20240605; b=HgZ0UHae5Kiux4BFdls1uJO7Zs11ck+ds33M0IaUEbVnKa5sE5H6i6lWzVha7gKaqu 47yGGvjG+Mt3a8hRHzAUAKFDEfytyoG5ShlzKl7I0U6lECapn7B4Gas5b9d/YI3JGRPn EaBB+3FoktmZyy4b9tzQ+imudKAwSkneg6XvU/CDjs5HusrEQqV5dekWAY93GGyXlS3Q Og3wXCOCjbd3np/emxNA/6Tefx2MpxFKaSqlTDrMGyNuVp9c7zqfwWG+KhkTiP0HybYH LOxHiOZ1Iwy5Q3uONEqZbD2T1416fP2pUKUKIL0PAHQo2iosHrjen0m6V0usTvzqku0n 1elw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=yAQ+hSN+Ph81nZ3Zi+DUlRgClRlcYXd+tzs15DdSXTE=; fh=PG6obZ5YAiyiwjcKyIBD0Qa3ermV+r0PhukjHMsuOrY=; b=NN23N2qoqP6Mn7VqyaqTzZqjeF4LUlOrAe4IphBA3V92tszRQglmZJGHtqa/jX8WqC u5nl6Q/tICL+q6xcfhfg7LiEmq3UvinfV1wgXzknOnIsdmClUf06kE4w5f+c0WDEL9PN MqHLUloq36pxNqOfnJX8js4aCufCfSlXEp4LTsEcOrw8FFprASu+W/SKsc43yJ7PU9te KcRfClT20wMv2kK+x4vglq5OOAM1V4kwDXR8PYzgKOMm1G+f/RwWxuRtXeLLuSHLWlSZ 8vODu8nF/F9jGafHnCE+bV4XKPStghVle/P1DDtAfBb5NYPvTf9Azlvy9ng67yx6FXne 4zOw==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1782120116; x=1782724916; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=yAQ+hSN+Ph81nZ3Zi+DUlRgClRlcYXd+tzs15DdSXTE=; b=q9Jsq2g3KSmlLJsgY062Ar/Q/0ktZQZ8Ml01HoyAP4+EvU30LP9F/3JFqkRH6Crrh0 +2aL8D4sPpAjRGLZegMAAZXRo03h/OPu7k1FfyK2LRCySlfQQ5j1sIGoOeYTN+b11FDg NYY7ZOspaVUZZwE8FbJ2BRqQve+VvyC9Xjxu9cw9OrS443BtqX5Gd3Q11gyb+jyNHYd5 Opf/RnHmtWI7liesKgerSRuKZFtqFremD8OBxai52nzKIcKbW/3xgDL42jDjCYg7oTBJ XLQndFL0BhMGdlwvp2pN5Im824xPCofgE8r+ofKvDTz0f1bEQfi5AB6RIlVXfQNnPTBC +bSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1782120116; x=1782724916; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=yAQ+hSN+Ph81nZ3Zi+DUlRgClRlcYXd+tzs15DdSXTE=; b=LSzuEMmYQIjeYt69w8YEjPCCJAB+1AHLkZoyuV0oXsTB8Lu/1DYHIMzlVc76KC4Yym Z60fo7u+gUrwiiUIntbumOLQIYanOSMqG8IFsu8EH9i28OzxvTD5v//MkJNiOjtI7FvN e2g1W/BiHlaGthA4nrZ3J8mEj1OKcYadUJeHPa9FpXf/h+TlE+HHNxjb7f/z7XXuR4O7 dnjum93rc/hTp6ecdZL3GUGJ1kmpm9c1hNx/+rnsFE1K7Idt5vpNI8g64sSWEhKEkfm5 demKx95aO/gzob4C6IdAMUgB75qTS6cp8uqoPPs75PZ7z4NVqO8aZCPPjsEsJ72YrpMg w34A== X-Forwarded-Encrypted: i=1; AFNElJ8yCriiu626TY7SNRhrng3etM68vlLWHSLaENtRETS1br2ivm60NCWzz9+0ciRRVZY2KWcuRctTQU9/B52w@lists.postgresql.org X-Gm-Message-State: AOJu0Yzh6WJ10JhFP3zSX09StmeOymJvt7sKV2ttZCvcm0+vL1i+kUSk ZhBV6eT7d6+WCh3HtnWUbdF+FDM5b+TrSljy+0x83t3mvpE2kPD9Ho6X4844jdYpuRpaeuGZW/g BVllCfpZNiH58OqCHjq7SiqlqHMU34rc= X-Gm-Gg: AfdE7ckeT+YdBy7PNyPG3V9m5s6AxbbQJIcrac8st1vNdaM8SnDlzdsYWhELkM8JApV DNfnhvRhAM0j9FnOKQZTEmYXF4rV2MNPfVjkHZ0zHJd5TusEcCOO4iNPrZEY4V+wTAwsK4/0RLu dAs1LnqUCepZ8EfZBhn2jwGqrEsgRizaOPNX7luuERByH/wkHE+GJFvb7gTxziqReRIDg3h+4/9 AgDO/2EviT+j6+0KrJnCW2eVoEsdG/zN6z0gFc9RMd1JkNfUZykByvWBxYs+FcJzEFbLRHS18mc aQFIL/mlC4JkW0SdpJuxmakFjsw/+8TRbJM2iIF2JRQ= X-Received: by 2002:a05:6a21:9209:b0:39f:2af3:e8ee with SMTP id adf61e73a8af0-3bb3544ac10mr11244210637.13.1782120116423; Mon, 22 Jun 2026 02:21:56 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: shveta malik Date: Mon, 22 Jun 2026 14:51:43 +0530 X-Gm-Features: AVVi8CczOGwCAB_xCfI9EQ3h8ZzePAX7IuUilMACQBJCrOP34ZTBI426Fqw9lGw Message-ID: Subject: Re: Proposal: Conflict log history table for Logical Replication To: Amit Kapila Cc: vignesh C , Dilip Kumar , Shlok Kyal , Peter Smith , Nisha Moond , Masahiko Sawada , Bharath Rupireddy , PostgreSQL Hackers , shveta malik Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, Jun 22, 2026 at 2:38=E2=80=AFPM Amit Kapila wrote: > > On Mon, Jun 22, 2026 at 1:56=E2=80=AFPM shveta malik wrote: > > > > On Mon, Jun 22, 2026 at 9:33=E2=80=AFAM vignesh C = wrote: > > > > > > On Mon, 22 Jun 2026 at 08:41, Amit Kapila w= rote: > > > > > > > > On Sun, Jun 21, 2026 at 7:53=E2=80=AFPM vignesh C wrote: > > > > > > > > > > While attempting to log a conflict, a concurrent ALTER SUBSCRIPTI= ON > > > > > can change the conflict logging destination from all to log. In t= his > > > > > scenario, the apply worker may already have cached the conflictlo= gdest > > > > > information, including the OID of the current conflict log table. > > > > > However, the concurrent ALTER SUBSCRIPTION drops the conflict log > > > > > table as part of the destination change: > > > > > +Relation > > > > > +GetConflictLogDestAndTable(ConflictLogDest *log_dest) > > > > > +{ > > > > > + Oid conflictlogrelid; > > > > > + > > > > > + /* > > > > > + * Convert the text log destination to the internal enum. > > > > > MySubscription > > > > > + * already contains the data from pg_subscription. > > > > > + */ > > > > > + *log_dest =3D GetConflictLogDest(MySubscription->conflict= logdest); > > > > > + > > > > > + /* Quick exit if a conflict log table was not requested. = */ > > > > > + if (!CONFLICTS_LOGGED_TO_TABLE(*log_dest)) > > > > > + return NULL; > > > > > + > > > > > + conflictlogrelid =3D MySubscription->conflictlogrelid; > > > > > + > > > > > + Assert(OidIsValid(conflictlogrelid)); > > > > > + > > > > > + return table_open(conflictlogrelid, RowExclusiveLock); > > > > > +} > > > > > > > > > > As a result, when the apply worker later attempts to open the cac= hed > > > > > conflict log table, table_open() fails because the relation has > > > > > already been dropped. This causes the error handling path itself = to > > > > > fail before the conflict record can be written to either the conf= lict > > > > > log table or the server log. > > > > > > > > > > In such cases, the conflict record is effectively lost and is not > > > > > logged anywhere. For example: > > > > > 2026-06-21 19:31:13.592 IST [263598] LOG: logical replication ap= ply > > > > > worker for subscription "sub1" has started > > > > > 2026-06-21 19:32:26.731 IST [263598] ERROR: could not open relat= ion > > > > > with OID 16405 > > > > > 2026-06-21 19:32:26.731 IST [263598] CONTEXT: processing remote = data > > > > > for replication origin "pg_16404" during message type "INSERT" fo= r > > > > > replication target relation "public.t1" in transaction 698, finis= hed > > > > > at 0/017D39A0 > > > > > 2026-06-21 19:32:26.735 IST [263471] LOG: background worker "log= ical > > > > > replication apply worker" (PID 263598) exited with exit code 1 > > > > > > > > > > Ideally, failure to access the conflict log table should not prev= ent > > > > > the conflict from being reported in the server log. This issue is > > > > > present with the v52 version. I have not yet checked if Amit's re= cent > > > > > patch posted a few minutes ago at [1] handles this issue. > > > > > > > > > > > > > There are two places in the patch from where we LOG/Insert the > > > > conflict data. First is ReportApplyConflict() where we LOG if the > > > > conflict arises from a non-ERROR path (aka conflicts other > > > > INSERT/UPDATE_EXISTS). In that case, the conflict data will be log= ged > > > > even when we fail to insert into CLT. Second is the place for > > > > conflicts that arose as ERRORs (aka INSERT/UPDATE_EXISTS), where th= e > > > > conflict information will be logged along with insert failure as > > > > CONTEXT. Can you please verify your test based on this input and sh= are > > > > your findings and thoughts? > > > > > > The scenario I am testing is an insert_exists conflict. > > > On the publisher: > > > CREATE TABLE t1 (c1 int); > > > > > > On the subscriber: > > > CREATE TABLE t1 (c1 int PRIMARY KEY); > > > > > > Then execute the following on the publisher: > > > INSERT INTO t1 VALUES (10); > > > INSERT INTO t1 VALUES (10); > > > > > > The second insert generates an insert_exists conflict on the > > > subscriber. The conflict is reported and logged through the following > > > call chain: > > > apply_handle_insert > > > -> apply_handle_insert_internal > > > -> ExecSimpleRelationInsert > > > -> CheckAndReportConflict > > > -> ReportApplyConflict > > > > > > Pause execution in ReportApplyConflict() at > > > GetConflictLogDestAndTable(), immediately before opening the conflict > > > log table: > > > ... > > > return table_open(conflictlogrelid, RowExclusiveLock); > > > ... > > > > > > While the apply worker is paused, execute the following command concu= rrently: > > > ALTER SUBSCRIPTION sub1 > > > SET (conflict_log_destination =3D 'log'); > > > > > > This succeeds and drops the conflict log table: > > > NOTICE: dropped conflict log table "pg_conflict.pg_conflict_log_1640= 4" > > > for subscription "sub1" > > > ALTER SUBSCRIPTION > > > > > > At this point, GetConflictLogDestAndTable() has already determined > > > that the conflict should be logged to a table and has cached the > > > corresponding relation OID. However, the concurrent ALTER SUBSCRIPTIO= N > > > has removed that table. > > > > > > When execution resumes, the subsequent table_open() call fails with: > > > 2026-06-22 09:24:53.072 IST [304864] ERROR: could not open relation > > > with OID 16405 > > > > > > As a result, conflict processing itself fails before the conflict > > > details can be recorded. The conflict is therefore not logged to the > > > conflict log table and is also not emitted to the server log. > > > > > > > > > I understand this case, but I feel it isn't critical because the table > > is going to be dropped in parallel, so ultimately, all data is lost. > > At-max, we can provide a LOG when table-open fails, indicating that > > the CLT table is dropped concurrently and thus conflict-data cannott > > be logged to table. > > > > Instead of adding additional LOG, a simpler fix would be to use > try_table_open() and if the table is dropped, silently just LOG the > conflict and proceed (see attached top-up patch). Yeah, the idea looks good. > In general, I agree > that it is not a very critical issue but as the fix is simpler, I > thought it is better to address so that apply worker can continue > instead of erroring out. yes, my intent was the same, the apply worker can simply LOG (table-open issue) and continue when table_open does not give table. But we can LOG the conflict instead as you suggested. > Having said that, I think we can't handle > each and every corner case where there are some other errors before we > can LOG the conflict, say some OOM or some other error happens during > CheckAndReportConflict. > I agree. thanks Shveta