From: Dilip Kumar <[email protected]>
To: Bharath Rupireddy <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: SATYANARAYANA NARLAPURAM <[email protected]>
Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
Date: Mon, 9 May 2022 15:14:03 +0530
Message-ID: <CAFiTN-thTDvo1CKJ2DDdZpGJwc4erp8jy8EKx=LPzw0wi_XMiw@mail.gmail.com> (raw)
In-Reply-To: <CALj2ACWoB3k0kfjA7JxJgskVGGeE3jWzmGRPjP9QTRSCgSjhOg@mail.gmail.com>
References: <CALj2ACUrOB59QaE6=jF2cFAyv1MR7fzD8tr4YM5+OwEYG1SNzA@mail.gmail.com>
<[email protected]>
<CALj2ACWoB3k0kfjA7JxJgskVGGeE3jWzmGRPjP9QTRSCgSjhOg@mail.gmail.com>
On Mon, May 9, 2022 at 2:50 PM Bharath Rupireddy
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 11:57 AM Laurenz Albe <[email protected]> wrote:
> >
> > On Mon, 2022-04-25 at 19:51 +0530, Bharath Rupireddy wrote:
> > > With synchronous replication typically all the transactions (txns)
> > > first locally get committed, then streamed to the sync standbys and
> > > the backend that generated the transaction will wait for ack from sync
> > > standbys. While waiting for ack, it may happen that the query or the
> > > txn gets canceled (QueryCancelPending is true) or the waiting backend
> > > is asked to exit (ProcDiePending is true). In either of these cases,
> > > the wait for ack gets canceled and leaves the txn in an inconsistent
> > > state [...]
> > >
> > > Here's a proposal (mentioned previously by Satya [1]) to avoid the
> > > above problems:
> > > 1) Wait a configurable amount of time before canceling the sync
> > > replication by the backends i.e. delay processing of
> > > QueryCancelPending and ProcDiePending in Introduced a new timeout GUC
> > > synchronous_replication_naptime_before_cancel, when set, it will let
> > > the backends wait for the ack before canceling the synchronous
> > > replication so that the transaction can be available in sync standbys
> > > as well.
> > > 2) Wait for sync standbys to catch up upon restart after the crash or
> > > in the next txn after the old locally committed txn was canceled.
> >
> > While this may mitigate the problem, I don't think it will deal with
> > all the cases which could cause a transaction to end up committed locally,
> > but not on the synchronous standby. I think that only using the full
> > power of two-phase commit can make this bulletproof.
>
> Not sure if it's recommended to use 2PC in postgres HA with sync
> replication where the documentation says that "PREPARE TRANSACTION"
> and other 2PC commands are "intended for use by external transaction
> management systems" and with explicit transactions. Whereas, the txns
> within a postgres HA with sync replication always don't have to be
> explicit txns. Am I missing something here?
>
> > Is it worth adding additional complexity that is not a complete solution?
>
> The proposed approach helps to avoid some common possible problems
> that arise with simple scenarios (like cancelling a long running query
> while in SyncRepWaitForLSN) within sync replication.
IMHO, making it wait for some amount of time based on a GUC is not a
complete solution. It is just a hack that avoids the problem in some
cases.
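
To make the concern concrete, here is a toy model of the wait, not
PostgreSQL code. It assumes time advances in discrete ticks, and the
names wait_for_sync_ack, ack_tick, cancel_tick and naptime_ticks are
hypothetical; naptime_ticks only stands in for the proposed
synchronous_replication_naptime_before_cancel GUC. The sketch shows that
deferring the cancel helps when the standby acks within the nap time, but
the txn still ends up committed locally and not replicated when the
standby is slower than that.

```python
from enum import Enum


class WaitResult(Enum):
    ACKED = "acked"                     # standby confirmed the commit
    CANCELED_UNREPLICATED = "canceled"  # wait abandoned, commit is local-only


def wait_for_sync_ack(ack_tick, cancel_tick, naptime_ticks):
    """Toy model of a backend waiting for a sync standby ack after local commit.

    ack_tick:      tick at which the standby ack arrives (None = never)
    cancel_tick:   tick at which a cancel or terminate request arrives (None = never)
    naptime_ticks: how long a pending cancel is deferred (hypothetical stand-in
                   for the proposed GUC)
    """
    if ack_tick is None and cancel_tick is None:
        raise ValueError("would wait forever: no ack and no cancel")

    tick = 0
    while True:
        # The ack is checked first, so an ack that arrives exactly when the
        # deferral expires still counts as replicated.
        if ack_tick is not None and tick >= ack_tick:
            return WaitResult.ACKED
        # The cancel is only honored once the deferral period has elapsed.
        if cancel_tick is not None and tick >= cancel_tick + naptime_ticks:
            return WaitResult.CANCELED_UNREPLICATED
        tick += 1


if __name__ == "__main__":
    # Standby acks within the nap time: the deferral hides the problem.
    print(wait_for_sync_ack(ack_tick=5, cancel_tick=2, naptime_ticks=10))
    # Standby is slower than the nap time: the problem is still there.
    print(wait_for_sync_ack(ack_tick=50, cancel_tick=2, naptime_ticks=10))
```

In this model no finite nap time covers a standby that is down or
arbitrarily slow, which is the case the timeout cannot address.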
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com