Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wV2SW-001dVU-1s for pgsql-hackers@arkaria.postgresql.org; Thu, 04 Jun 2026 07:25:44 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wV2SV-005MOJ-1a for pgsql-hackers@arkaria.postgresql.org; Thu, 04 Jun 2026 07:25:43 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wV2SV-005MOA-0R for pgsql-hackers@lists.postgresql.org; Thu, 04 Jun 2026 07:25:43 +0000 Received: from mail-lj1-x22b.google.com ([2a00:1450:4864:20::22b]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wV2SS-00000001CJ3-4A5Z for pgsql-hackers@lists.postgresql.org; Thu, 04 Jun 2026 07:25:42 +0000 Received: by mail-lj1-x22b.google.com with SMTP id 38308e7fff4ca-3967725a77fso3358961fa.1 for ; Thu, 04 Jun 2026 00:25:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1780557940; cv=none; d=google.com; s=arc-20240605; b=KLvnN91vzQ3zNg9xMNA2rpVUvATeqdNNDGHHQm0mesvT7+R8ijZt6ZDsL9hlZYgI5s o4pJ6chdifygFgNqbRFauMdpHjFfyz41xvpyJ+mdci+F0WZJ12mo2GMJDgZBfIg3iAqt /y0XFE8aY+MCfMLxbPs/eNJb1Bc4t3f9iQrgILoHLnfvPaVrYiOXuYmBCXErkmJdJOD6 dliCcH3hwHrhgWkBnog7vuJkrnPVrRP9RHwoNtCi/b2uKTXYw+IaUwyrx4m9nin9ECVe cMbRAdne0zYG+ye9nYSV8E/cwiBXGuHX6D+TZLG43Ju7W4ycN3ErlOcaBZGwpB4ayZNQ hnFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=WlQ2Z+4of2Irzeg8UXsKQ6ncyOaP+GIUEYJCs919ugs=; fh=mcEPB5Lif5vl54S8RnO7jW/434JT/cJNRfuzAnt1iow=; b=SOtgC1AYn0OBz3G+EjwJ9oVYCDsqbvbX0RsO743xEGFpw3UOKsVZxKJSjMQdYb1HlD ckcMvA0gWaPWB1hM/1Np5F31KLpLqhWYrHGC4x9N88FbDZQc7je5EykLAUIsc5rWGm1U Q+c17T22xmqTnx5Gw2/8TV6peZ/T75tX3S4PHu6KqagAqxSxp96US9Z+TAfzxY4R7kKA +Gn+YLz0fYw6B36SFtodbjbja3cvzexjzhUwna3ZKGZVNHXkqhGgMIH4XRCcotsEvYqL rtlu3HHMZXhCiwpxR4V75XZUxKymlKE1UzBqJ7wELXIQPNvxinthcyULxoioU58jTtmp voHw==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780557940; x=1781162740; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=WlQ2Z+4of2Irzeg8UXsKQ6ncyOaP+GIUEYJCs919ugs=; b=nhoJZWQX7L5r7WAW3CSFeV26juz9Jqna8nCriX63oZFdFzT/rb8OGk7O8ZxMThdy9c oeA+j/pRAyN8D5nWPDkMfEWrk7nsXKZJteX9HbXTP0tIrs98nWkUIYrYXiWbt1jzMrC2 D3VzjmTWDHioskDsrke35PsYgtQXnlGmwO1KlNrjSbBQxb+cqMhVwDFIDE1Fsp7TUe/R wSUMSpQnNLot4RgN7n7YXTd2qNEsn1Lx0Q9YFVJyzqkCCujiP/mvRJ6OzpL0xPqRgBCg xmu87mPzQoWlEjC8nMQmYPd7pIAaozTjA4EUJJda4Ph5nxUP22reGeRoCEw3076oBZBH bYnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780557940; x=1781162740; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=WlQ2Z+4of2Irzeg8UXsKQ6ncyOaP+GIUEYJCs919ugs=; b=s6wO9h3czWT3f3SsMbTOB4yOC3RvysVRO4s/MWRqMxLzncBZhEEbIH+FUTP6E8XzwR f6Jy5ps2+yr4yaAiIYlH2Ft3Mqf8DqTuSelFqZDnumZyb62eeysAmNAA1fJSSYjFNkc5 X2dvxvL0go2Ao2MP5lQ93/ks7hymdSghN3s38JwhdHZCEspA7xBt7WzGKgYRITXEkAcL 8c+CfmiRjuP88xR7pq+t0DdlXpV8S5a3rIjtKlSSc3Nu3+e2fT5k096ghCAzOgygsxql D7UGsH1rTv1qFZZyTZrGd+4IGXjcTvHOz8xtGq5ykMntunl3GW0dn67iiDNYZ4rVer6Y bY3w== X-Forwarded-Encrypted: i=1; AFNElJ/+/mrjkP+c+LF1uYUdCy/EKnUrREeJXlzLxJNdXtiaBU2vUYEz9mfdPWk+AqUpSDsKFX+m04RuOSiuvg4z@lists.postgresql.org X-Gm-Message-State: AOJu0Yz+R+qqUN6n2+CWQ/R6q9gRyusCXKGEaHBK/gFuw98nNq06MKEI No0oMHuBnK2lu4KSgFWcQEkBwK7f+hA2vGPx5jf3lpSGmTvL+FUzwTFDph2oNabfh/fRFLcTRl6 DduzoxdjnHdQC96Z/bgQ6yOQkafWWMgc= X-Gm-Gg: Acq92OEbJBbwlVD5MRXdEDw1WXLQ+HVcpwT4Sxno/S0PCSZvFRPLcYit/f2IVEwYmg3 CMrQtToNKG2vMHVikL2mWuVA1rHri1Fy6DHyafLB4pY9S6qlY4n3TGTA4R9kwGuGxiwhq3utpTp sxdLB/TT6662PfKK8jh9JvBAdYIHjlJBKiViLoLUPLqMybu+c4nX1X1Qp/inHu2tpmJf1XeLUlI 1uGx/KarT4a6SqlDEo6SERqiBcDzNnmRmj0a1tH2xaGBhyaU0DU9AJNpDUijMzafUIMciKlvxFl OnF1ECePPFWyh/7zr7noBgxVY7aaY5ZaUhnCeJO91MVWslr87zK4e3PVLUs= X-Received: by 2002:a05:6512:3190:b0:5aa:6db6:537 with SMTP id 2adb3069b0e04-5aa7c0bc7bdmr2285284e87.16.1780557939328; Thu, 04 Jun 2026 00:25:39 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Dilip Kumar Date: Thu, 4 Jun 2026 12:55:22 +0530 X-Gm-Features: AVVi8Cf6rUhZikAN5Bss39to_osacTu_MffYUwpM-CkX60PA6i-TzcPqtd1bRd0 Message-ID: Subject: Re: Proposal: Conflict log history table for Logical Replication To: Nisha Moond Cc: vignesh C , Amit Kapila , Peter Smith , shveta malik , Masahiko Sawada , Bharath Rupireddy , PostgreSQL Hackers , shveta malik Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Jun 3, 2026 at 3:25=E2=80=AFPM Nisha Moond wrote: > > On Wed, Jun 3, 2026 at 10:11=E2=80=AFAM vignesh C w= rote: > > > > On Tue, 2 Jun 2026 at 10:51, Dilip Kumar wrote: > > > > > > PFA, latest version of patch > > > > > > Changes: > > > 1. Includes Nisha's suggestion from v43_comments_nisha.patch > > > 2. Include Amit's suggestion from v43-0001_amit.1.patch > > > 3. Include Amit's suggestion: change GetLogDestination to > > > GetConflictLogDest(), drop_sub_conflict_log_table() to > > > drop_sub_conflict_log_table(), CONFLICTS_LOGGED_TO_FILE to > > > CONFLICTS_LOGGED_TO_LOG and renamed IsConflictNamespace() to > > > IsConflictLogTableNamespace() > > > 4. Include Vignesh's fix Parallel_apply_failure.patch > > > 5. Change conflict log table name back to conflict_log_table_ > > > 6. Changed acl to exclude INSERT/UPDATE/USAGE as it was there in init= ial patch > > > 7. Merged 0002 to 0001 and 0004 to 0003 after review > > > > > > Open > > > 1. Vignesh please rebase upgrade and \dRs+ page > > > > Here is the rebased version of the upgrade and \dRs patch which is > > present in v45-0003 and v45-0004. There is no change in v45-0001 and > > v45-0002, they are the same patch as in [1]. > > Thanks for the updated patches, please find my comments on the v44 > patches below. These appear to apply to v45-001 and v45-002 as well, > as both are unchanged. > > 1) The patches require a catversion bump. It may be worth adding it > now so it does not get missed later during commit. Yeah we can do this > > 2) As ProcessPendingConflictLogTuple() may perform a heap_insert, it > could fail for various reasons, such as table access errors, write > failures, WAL write failures, or ownership-related issues. > I looked into the error handling when CLT insert fails: > > 2a) When disable_on_error =3D false > > /* Try to allocate a worker for the streaming transaction. */ > @@ -5666,6 +5674,7 @@ start_apply(XLogRecPtr origin_startpos) > */ > AbortOutOfAnyTransaction(); > pgstat_report_subscription_error(MySubscription->oid); > + ProcessPendingConflictLogTuple(); > > PG_RE_THROW(); > } > > If ProcessPendingConflictLogTuple() fails, PG_RE_THROW() is never > reached, so the original conflict error is lost and only the secondary > CLT-related error is reported, e.g.: > > ERROR: could not open relation with OID 16421 > LOG: background worker "logical replication apply worker" exited > with exit code 1 > > If the failure persists, the apply worker will keep retrying and users > may never see the actual conflict error. The same concern applies to > the parallel worker case too. > > Is there a clean way to ensure the original apply/conflict error is > still logged even if ProcessPendingConflictLogTuple() itself fails? Does this happening when destination is table or with all as well? > ~~ > > 2b) When disable_on_error =3D true i.e. DisableSubscriptionAndExit() path > > @@ -6040,6 +6049,8 @@ DisableSubscriptionAndExit(void) > */ > pgstat_report_subscription_error(MyLogicalRepWorker->subid); > > + ProcessPendingConflictLogTuple(); > + > /* Disable the subscription */ > StartTransactionCommand(); > > This introduces a potential failure point before the subscription is > disabled. If ProcessPendingConflictLogTuple() fails, the subscription > may never be disabled and the worker could repeatedly hit the same > error. > > Would it be safer to move ProcessPendingConflictLogTuple() until after > the subscription has been disabled, so that disabling the subscription > is not dependent on successful CLT processing? Yeah, I think it make sense, I don't see obvious issue with moving it after disabling the subscription. > > 2c) When "conflict_log_destination =3D table" the log reports: > > ERROR: conflict detected on relation "public.t1": conflict=3Dinsert_ex= ists > DETAIL: Conflict details are logged to the conflict log table: > pg_conflict_log_16403 > > However, the subsequent CLT write can still fail: > ... > ERROR: could not open relation with OID 16426 > LOG: background worker "logical replication apply worker" exited > with exit code 1 > > Since the CLT write has not yet occurred when this message is emitted, > would it make sense to change it to: > "Conflict details will be logged to the conflict log table: ..." > > to avoid implying that the logging has already succeeded? IMHO "Conflict details are logged to the conflict log table" doesn't mean a conflict has been logged to the table; what I interpret from this message is that conflict logging to the table is enabled. > ~~~ > > 3) The v45-002 commit message still refers to the old conflict log > table name, pg_conflict.pg_conflict_log_for_subid_16396. It should be > updated to pg_conflict_log_16396. > ~~~ > > 4) 035_conflicts.pl: I see that v45-002 has corrected the CLT name in > below test - > +# Verify the contents of the Conflict Log Table (CLT) > +# This section ensures that the CLT contains the expected > +# type and specific key data. > +my $subid =3D $node_subscriber->safe_psql('postgres', > + "SELECT oid FROM pg_subscription WHERE subname =3D 'sub_tab';"); > +my $clt =3D "pg_conflict.pg_conflict_log_$subid"; > + > +# Wait for the conflict to be logged in the CLT > +my $log_check =3D $node_subscriber->poll_query_until( > + 'postgres', > + "SELECT count(*) > 0 FROM $clt;" > +); > > I wonder why this wasn't caught in the v44 tests. It seems the check > below is silently broken because poll_query_until() keeps retrying > until timeout when the table does not exist. > Should we replace it with an ok() test instead? something like - > > # Wait for the conflict to be logged in the CLT > ok($node_subscriber->poll_query_until( > 'postgres', > "SELECT count(*) > 0 FROM $clt;"), > 'conflict appeared in CLT after insert'); Why would CLT not exist, IIUC this is positive test where we are trying to validate the conflict data from CLT, no? --=20 Regards, Dilip Kumar Google