MIME-Version: 1.0
References: <9290b55b6ae2b04e002ca9dadadd1cca09461482.camel@cybertec.at>
 <20220805.114916.994654810780821553.horikyota.ntt@gmail.com>
 <CALj2ACWPMYoPSC3t-9uW+0gqDUcJf1mLww6hHzo2V2AvE-Tu+w@mail.gmail.com>
 <20220809.161236.1486509314201074910.horikyota.ntt@gmail.com>
 <CALj2ACXmMWtpmuT-=v8F+Lk4QCbdkeN+yHKXeRGKFfjG96YbKA@mail.gmail.com>
 <CALj2ACUO6oz-43ryqfMOVZ_Q-N10C5tkzKku12+QV02NnXsDrw@mail.gmail.com>
 <YzYh3NpCQAFkA6lF@momjian.us>
 <CALj2ACVuivbfS4zNGKPRE=rmme2VxC9obBjoOdrw5k+JKVk3UA@mail.gmail.com>
 <Yz3wUxW2a3raVbfJ@momjian.us>
In-Reply-To: <Yz3wUxW2a3raVbfJ@momjian.us>
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 6 Oct 2022 13:33:33 +0530
Message-ID: 
 <CALj2ACUqEw57RbG9DieLTku5ZCJdivSU32XuLH3U7rx0d6fJAQ@mail.gmail.com>
Subject: Re: An attempt to avoid
 locally-committed-but-not-replicated-to-standby-transactions
 in synchronous replication
To: Bruce Momjian <bruce@momjian.us>
Cc: Kyotaro Horiguchi <horikyota.ntt@gmail.com>,
 Laurenz Albe <laurenz.albe@cybertec.at>,
	PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
	SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Archived-At: 
 <https://www.postgresql.org/message-id/CALj2ACUqEw57RbG9DieLTku5ZCJdivSU32XuLH3U7rx0d6fJAQ%40mail.gmail.com>
Precedence: bulk

On Thu, Oct 6, 2022 at 2:30 AM Bruce Momjian <bruce@momjian.us> wrote:
>
> As I highlighted above, by default you notify the administrator that a
> synchronous replica is not responding and then ignore it.  If it becomes
> responsive again, you notify the administrator again and add it back as
> a synchronous replica.
>
> > command in any form may pose security risks. I'm not sure at this
> > point how this new timeout is going to work alongside
> > wal_sender_timeout.
>
> We have archive_command, so I don't see a problem with another shell
> command.

Why do we need a new command to inform the admin/user that a sync
standby has been ignored (dropped from the sync quorum) for not
responding or acknowledging for a certain amount of time in
SyncRepWaitForLSN()? Can't we just add an extra column to the
pg_stat_replication view, or reuse its existing sync_state column? We
can either introduce a new state such as temporary_async or just use
the existing state 'potential' [1]. One drawback is that the server has
to be monitored for this new state. If we do this, we don't need
another command to report.
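
To make the monitoring side concrete, here is a rough sketch in Python
of what a check could look like if we went with a new sync_state value.
It is purely illustrative: 'temporary_async' is only the state name
floated above and does not exist today, and the rows are assumed to
have been fetched with something like
SELECT application_name, sync_state FROM pg_stat_replication.

```python
# Hypothetical state name proposed in this mail; PostgreSQL does not
# report it today.
DEMOTED_STATE = "temporary_async"


def demoted_standbys(rows):
    """Return the names of standbys reported in the hypothetical state.

    rows is an iterable of (application_name, sync_state) tuples, assumed
    to come from:
        SELECT application_name, sync_state FROM pg_stat_replication
    """
    return sorted(name for name, state in rows if state == DEMOTED_STATE)


# Made-up rows, for illustration only.
rows = [
    ("standby1", "sync"),
    ("standby2", "temporary_async"),
    ("standby3", "async"),
]
print(demoted_standbys(rows))
```

A monitoring job would poll this periodically and alert whenever the
returned list is non-empty, which is the extra monitoring burden
mentioned above.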

> > I'm thinking about the possible options that an admin has to get out
> > of this situation:
> > 1) Removing the standby from synchronous_standby_names.
>
> Yes, see above.  We might need a read-only GUC that reports which
> synchronous replicas are active.  As you can see, there is a lot of API
> design required here, but this is the most effective approach.

If we use the above approach to report via the pg_stat_replication
view, we don't need this.

> > > Once we have that, we can consider removing the cancel ability while
> > > waiting for synchronous replicas (since we have the timeout) or make it
> > > optional.  We can also consider how to notify the administrator during
> > > query cancel (if we allow it), backend abrupt exit/crash, and
> >
> > Yeah. If we have the
> > timeout-and-auto-removal-of-standby-from-sync-standbys-list solution,
> > the users can then choose to disable processing query cancels/proc
> > dies while waiting for sync replication in SyncRepWaitForLSN().
>
> Yes.  We might also change things so a query cancel that happens during
> synchronous replica waiting can only be done by an administrator, not the
> session owner.  Again, lots of design needed here.

Yes, we'd need infrastructure to track who issued the query cancel or
proc die and so on. IMO, it's not a good idea to allow/disallow query
cancels or CTRL+C based on role types (superusers, users with the
replication attribute, or members of any of the predefined roles).

In general, it is the walsender serving the sync standby that has to
mark itself as async by removing its standby from
synchronous_standby_names, reloading the config variables, and waking up
the backends that are waiting in the syncrep wait queue for it to update
the LSN.

And, the new auto removal timeout should always be set to less than
wal_sender_timeout.

All that said, imagine we have the
timeout-and-auto-removal-of-standby-from-sync-standbys-list solution
in one form or another, with the auto removal timeout set to 5 minutes.
Either of the following can happen:

1) A query is stuck waiting for the sync standby ack in
SyncRepWaitForLSN(), no query cancel or proc die interrupt arrives, and
the sync standby is made an async standby after the timeout, i.e. 5
minutes.
2) A query is stuck waiting for the sync standby ack in
SyncRepWaitForLSN(), say for about 3 minutes, and then a query cancel or
proc die interrupt arrives. Should we process it immediately, or wait
for the timeout to happen (2 more minutes) and then process the
interrupt? If we process the interrupts immediately, then the
locally-committed-but-not-replicated-to-sync-standby problems
described upthread [2] are left unresolved.
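
The two cases above can be modelled with a toy simulation. This is only
a sketch in Python to pin down the question, not how SyncRepWaitForLSN()
works: resolve_wait(), the Outcome names and the defer_interrupts switch
are all made up here, and all times are seconds since the wait started.

```python
from enum import Enum


class Outcome(Enum):
    ACKED = "sync standby acknowledged the commit"
    STANDBY_DEMOTED = "timeout hit, standby made async, wait released"
    INTERRUPTED_EARLY = "interrupt processed before timeout, commit is local only"


def resolve_wait(removal_timeout, ack_at=None, interrupt_at=None,
                 defer_interrupts=False):
    """Return (outcome, time) describing how a backend's wait ends.

    ack_at / interrupt_at are None when that event never happens.
    defer_interrupts is a hypothetical switch: when True, a query cancel
    or proc die is held back until the wait ends for some other reason.
    """
    events = [(removal_timeout, Outcome.STANDBY_DEMOTED)]
    if ack_at is not None:
        events.append((ack_at, Outcome.ACKED))
    if interrupt_at is not None and not defer_interrupts:
        events.append((interrupt_at, Outcome.INTERRUPTED_EARLY))
    # The earliest event decides how the wait ends.
    when, outcome = min(events, key=lambda event: event[0])
    return outcome, when


# Case 1: no ack, no interrupt, 5 minute auto removal timeout.
print(resolve_wait(300))
# Case 2, interrupt processed immediately after 3 minutes.
print(resolve_wait(300, interrupt_at=180))
# Case 2, interrupt held back until the timeout fires.
print(resolve_wait(300, interrupt_at=180, defer_interrupts=True))
```

With immediate processing, case 2 ends in INTERRUPTED_EARLY at 180
seconds, which is exactly the locally-committed-but-not-replicated
situation. With the interrupt deferred, the wait ends at 300 seconds
when the standby is demoted, at the cost of the session being
unresponsive to cancel for 2 more minutes.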

[1] https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-VIEW
sync_state text
Synchronous state of this standby server. Possible values are:
async: This standby server is asynchronous.
potential: This standby server is now asynchronous, but can
potentially become synchronous if one of current synchronous ones
fails.
sync: This standby server is synchronous.
quorum: This standby server is considered as a candidate for quorum standbys.

[2] https://www.postgresql.org/message-id/CALj2ACXmMWtpmuT-%3Dv8F%2BLk4QCbdkeN%2ByHKXeRGKFfjG96YbKA%40mail.gmail.com

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com