Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Bruce Momjian <[email protected]>
To: Bharath Rupireddy <[email protected]>
Cc: Kyotaro Horiguchi <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: SATYANARAYANA NARLAPURAM <[email protected]>
Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
Date: Wed, 5 Oct 2022 17:00:03 -0400
Message-ID: <[email protected]> (raw)
In-Reply-To: <CALj2ACVuivbfS4zNGKPRE=rmme2VxC9obBjoOdrw5k+JKVk3UA@mail.gmail.com>
References: <[email protected]>
	<[email protected]>
	<CALj2ACWPMYoPSC3t-9uW+0gqDUcJf1mLww6hHzo2V2AvE-Tu+w@mail.gmail.com>
	<[email protected]>
	<CALj2ACXmMWtpmuT-=v8F+Lk4QCbdkeN+yHKXeRGKFfjG96YbKA@mail.gmail.com>
	<CALj2ACUO6oz-43ryqfMOVZ_Q-N10C5tkzKku12+QV02NnXsDrw@mail.gmail.com>
	<[email protected]>
	<CALj2ACVuivbfS4zNGKPRE=rmme2VxC9obBjoOdrw5k+JKVk3UA@mail.gmail.com>

On Sat, Oct  1, 2022 at 06:59:26AM +0530, Bharath Rupireddy wrote:
> > I have always felt this has to be done at the server level, meaning when
> > a synchronous_standby_names replica is not responding after a certain
> > timeout, the administrator must be notified by calling a shell command
> > defined in a GUC and all sessions will ignore the replica.  This gives a
                         ------------------------------------
> > much more predictable and useful behavior than the one in the patch ---
> > we have discussed this approach many times on the email lists.
> 
> IIUC, each walsender serving a sync standby will determine that the
> sync standby isn't responding for a configurable amount of time (less
> than wal_sender_timeout) and calls shell command to notify the admin
> if there are any backends waiting for sync replication in
> SyncRepWaitForLSN(). The shell command then provides the unresponsive
> sync standby name at the bare minimum for the admin to ignore it as
> sync standby/remove it from synchronous_standby_names to continue
> further. This still requires manual intervention which is a problem if
> running postgres server instances at scale. Also, having a new shell

As I highlighted above, by default you notify the administrator that a
sychronous replica is not responding and then ignore it.  If it becomes
responsive again, you notify the administrator again and add it back as
a sychronous replica.

> command in any form may pose security risks. I'm not sure at this
> point how this new timeout is going to work alongside
> wal_sender_timeout.

We have archive_command, so I don't see a problem with another shell
command.

> I'm thinking about the possible options that an admin has to get out
> of this situation:
> 1) Removing the standby from synchronous_standby_names.

Yes, see above.  We might need a read-only GUC that reports which
sychronous replicas are active.  As you can see, there is a lot of API
design required here, but this is the most effective approach.

> 2) Fixing the sync standby, by restarting or restoring the lost part
> (such as network or some other).
> 
> (1) is something that postgres can help admins get out of the problem
> easily and automatically without any intervention. (2) is something
> postgres can't do much about.
> 
> How about we let postgres automatically remove an unresponsive (for a
> pre-configured time) sync standby from synchronous_standby_names and
> inform the user (via log message and via new walsender property and
> pg_stat_replication for monitoring purposes)? The users can then
> detect such standbys and later try to bring them back to the sync
> standbys group or do other things. I believe that a production level
> postgres HA with sync standbys will have monitoring to detect the
> replication lag, failover decision etc via monitoring
> pg_stat_replication. With this approach, a bit more monitoring is
> needed. This solution requires less or no manual intervention and
> scales well. Please note that I haven't studied the possibilities of
> implementing it yet.
> 
> Thoughts?

Yes, see above.

> > Once we have that, we can consider removing the cancel ability while
> > waiting for synchronous replicas (since we have the timeout) or make it
> > optional.  We can also consider how do notify the administrator during
> > query cancel (if we allow it), backend abrupt exit/crash, and
> 
> Yeah. If we have the
> timeout-and-auto-removal-of-standby-from-sync-standbys-list solution,
> the users can then choose to disable processing query cancels/proc
> dies while waiting for sync replication in SyncRepWaitForLSN().

Yes.  We might also change things so a query cancel that happens during 
sychronous replica waiting can only be done by an administrator, not the
session owner.  Again, lots of design needed here.

> > if we
> > should allow users to specify a retry interval to resynchronize the
> > synchronous replicas.
> 
> This is another interesting thing to consider if we were to make the
> auto-removed (by the above approach) standby a sync standby again
> without manual intervention.

Yes, see above.  You are addressing the right questions here.  :-)

-- 
  Bruce Momjian  <[email protected]>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson

view thread (37+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox