Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2lKv-000Zow-2Q for pgsql-hackers@arkaria.postgresql.org; Wed, 18 Mar 2026 07:29:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w2lKt-008Ksf-1c for pgsql-hackers@arkaria.postgresql.org; Wed, 18 Mar 2026 07:28:59 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2lKt-008KsX-0F for pgsql-hackers@lists.postgresql.org; Wed, 18 Mar 2026 07:28:59 +0000 Received: from mail-vk1-xa2b.google.com ([2607:f8b0:4864:20::a2b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w2lKp-00000000Jou-2gqz for pgsql-hackers@lists.postgresql.org; Wed, 18 Mar 2026 07:28:58 +0000 Received: by mail-vk1-xa2b.google.com with SMTP id 71dfb90a1353d-56a9c5cb48bso2615423e0c.0 for ; Wed, 18 Mar 2026 00:28:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773818936; cv=none; d=google.com; s=arc-20240605; b=WM20N1MosRwSRUlBRuqbINRTE84pqFxnc+SirRk6binZHaxYm9kRGjQjev+s90jNgr dlBc8/TPPT6iltVBH+yvKAzpktZclRkFzl5iUEiOyK1MigYQB7aHHVqsbsYm33P8talT 9qyrYWwJhaNM4AtxXreESJ/w7UceoYlozNiI/zagLkF3EK25jOGEvts2mibkc0QADWT6 P3kiL4nOzv+dcvlY6EnT28CXD2e1yrJgwAgzGKYFwV1bRUiVZoKTI/c+RWe1445Bfe3Q 38pmyhoqrc8LJhox7iPhARyU+58aUMnCenx5k1yQVByDl7Vo0TCa1e6ZevB58PDkK9JO BfKg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=BZMMsF94EVccSs3kjuQhkOCEA2NMSyvpdFvY3/P/8x4=; fh=dtwre1TbBcR6DnwqkrtZZ0VOR9wy+xuplJK4TPbL4Z8=; b=CP3SvXratsXBXErgb2Xsat0ThhDPSOncnUshSkvgjYzL5+yIzGOxz1ejhe/RY4gtzy eGNldCn9lWGF25+2Kn8bl+Fl8xJwRMzwY5XL2OvStDusDxU+Z6Me6Jmqg13UHbOXaOvh FmWCPwa2rAzbggDizKSQ/yD9DqKZXUJr6bH616CtQhKRCRKWqYxZTteVcze4oM9XJNTZ WabvCC1f5e80K2gc2QyucxDrKmL0pvQOI3aHdkRAoc+jzEAGN/Ebh0IgMpQfBFQjKTLA 0z10pohUN0QY9sg2Sazu/OxIyQY6K9fT9CmcCa6jtJclWc70IDdyFjTJD3M9CaSlqgH+ b2/Q==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773818936; x=1774423736; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=BZMMsF94EVccSs3kjuQhkOCEA2NMSyvpdFvY3/P/8x4=; b=ChCwheCU6hH5SNIpcM+YTJTRaLdExgV1YIAAeA5hObQ99O0WSQDCSNy5lywRMX0jdo mJNTThrUnHP1SRIU1gGV0z2VC7iwJxhL9mJdzaJW8HHpWUKA8gNf+LWRZ0nKMrjbDhNT K+LrYJOnksIXwl/yHz/UcKbezTOVowhVBYyzaeCSCdXflA/Fo+H/MwgurjOh+5ztp93Q O5NQmjg31iWYA+G6+fZ7/HFOJsKS04ek7MzaCv+Qn7vvESDW7LPku0B1Yz9miOSeejiX u/oPl5K46xW91srVujqoGmHi14Jkt84gFzOxcDUjh9+rKiA5a3jqTl7iAQeGN2rK52Hn nBLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773818936; x=1774423736; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=BZMMsF94EVccSs3kjuQhkOCEA2NMSyvpdFvY3/P/8x4=; b=su2ZF4NM1+kde5v8Nxdl9MgWjx9nAxqb17ep4HgE9C5DMBjfkFzhjfN6mKuAcAHQsZ 3AUDUuoLx3/yhzx++bYRkRXbjV+NbhIkFBGWk74SPJWhZPso4AQqFzfRc0fSYSLwdipr Znin/8yuyEwsq2LkbIqvS3T+LQ78z0UbJgZowDlw1GqRbWRlhoaNepy/coCtl3UoCPwK ILw739Vd2UArBjW546RF916Dbr24GXe/QbblBA5PMN3o+n6RDtkDWcpaR6RDGkvtlvVr tm3Mq6dgTA3wADq4C/xTo0y/EGne4bcWB7MJD787WTWvMHxCRdwv1+JlCsTCMaaQH3LB bwKQ== X-Forwarded-Encrypted: i=1; AJvYcCXnm2aa9XeQoVAxNQ/KMb1kbLAW9zomq3SLVJdVwl1o6kOUvts2v+QSyucdFnfP3NQFSH/lP3PGqlx7izFG@lists.postgresql.org X-Gm-Message-State: AOJu0Yw+/Qu1q5VWml49xLPeZyTv3p1/B+s2zev3Uu401/9x8HSBqOuH q7dnpxdtX8Jf6q3Y4wOpU93dFj6cwCcOEGu+ZDJX6zqkbOiDNJU/iMtWyI/FeIRNhcgFE8xatC9 T/9S29Q1S/Mx+XVnZLP7w3o+XrNrcW54= X-Gm-Gg: ATEYQzyznjd9FTc3zAn5rFCZJlQekdoT1nwSfUs71cfyAFU27byfCswtSFmSFIxe7r7 OJQlYn0Cls0NXrvD7DIn4tecTBwlR/9WBviAKVe+DM9vyCWbiw91ckEafKK06K3hTUw9u44zBr/ KxGF7kCeAaXMJw6J/nxdT7ZjcbiSQ/8ST5jhwTJzR5MUrgROqCcLKt5VyKaVfNcQEdElJg5Z8s6 Mfys62oI3WIl9zI76nhGbn6vNvz2EQ1uhg43hRjrDJOyqJh0esUmDrrVnlfkOA36WLyVzLLUYfL 2ICT79A= X-Received: by 2002:a05:6102:a53:b0:5ff:1f6e:6278 with SMTP id ada2fe7eead31-6027d3fd019mr1055552137.34.1773818936369; Wed, 18 Mar 2026 00:28:56 -0700 (PDT) MIME-Version: 1.0 References: <20220805.114916.994654810780821553.horikyota.ntt@gmail.com> <20220809.161236.1486509314201074910.horikyota.ntt@gmail.com> In-Reply-To: From: SATYANARAYANA NARLAPURAM Date: Wed, 18 Mar 2026 00:28:44 -0700 X-Gm-Features: AaiRm53SNhmMzyiTkrOI6mJuDOHwyrmdJQbUdJ7BWFLCcmsgtk6WKYkBul3O5wc Message-ID: Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication To: Bharath Rupireddy Cc: Bruce Momjian , Andrey Borodin , Kyotaro Horiguchi , Laurenz Albe , PostgreSQL Hackers Content-Type: multipart/alternative; boundary="0000000000009927dc064d476732" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000009927dc064d476732 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Reviving this thread. On Sun, Jan 29, 2023 at 9:55=E2=80=AFPM Bharath Rupireddy < bharath.rupireddyforpostgres@gmail.com> wrote: > For proc die, it looks like the suggestion was to process it > immediately and upon next restart, don't allow user connections unless > all sync standbys were caught up. However, we need to be able to allow > replication connections from standbys so that they'll be able to > stream the needed WAL and catch up with primary, allow superuser or > users with pg_monitor role to connect to perform ALTER SYSTEM to > remove the unresponsive sync standbys if any from the list or disable > sync replication altogether or monitor for flush lsn/catch up status. > And block all other connections. Note that replication, superuser and > users with pg_monitor role connections are allowed only after the > server reaches a consistent state not before that to not read any > inconsistent data. > Allowing replication, superuser and pg_monitor seems reasonable to me. > > The trickiest part of doing the above is how we detect upon restart > that the server received proc die while waiting for sync replication > ACK. One idea might be to set a flag in the control file before the > crash. Second idea might be to write a marker file (although I don't > favor this idea); presence indicates that the server was waiting for > sync replication ACK before the crash. However, we may not detect all > sorts of crashes in a backend when it is waiting for sync replication > ACK to do any of these two ideas. Therefore, this may not be a > complete solution. > You cannot control the crash, it can be a simple power failure too and none of them could have reached the disk. Additionally, this is in a critical transaction commit path. > > Third idea might be to just let the primary wait for sync standbys to > catch up upon restart irrespective of whether it was crashed or not > while waiting for sync replication ACK. While this idea works well > without having to detect all sorts of crashes, the primary may not > come up if any unresponsive standbys are present (currently, the > primary continues to be operational for read-only queries at least > irrespective of whether sync standbys have caught up or not). > I prefer this approach because depending on the quorum policy defined in the synchrnous_standby_names, the primary will open connections for read/writes. If there is no progress from sync standbys then Postgres admin has to jump in regardless. Thanks, Satya --0000000000009927dc064d476732 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Reviving this thread.

On Sun, Ja= n 29, 2023 at 9:55=E2=80=AFPM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
Fo= r proc die, it looks like the suggestion was to process it
immediately and upon next restart, don't allow user connections unless<= br> all sync standbys were caught up. However, we need to be able to allow
replication connections from standbys so that they'll be able to
stream the needed WAL and catch up with primary, allow superuser or
users with pg_monitor role to connect to perform ALTER SYSTEM to
remove the unresponsive sync standbys if any from the list or disable
sync replication altogether or monitor for flush lsn/catch up status.
And block all other connections. Note that replication, superuser and
users with pg_monitor role connections are allowed only after the
server reaches a consistent state not before that to not read any
inconsistent data.


The trickiest part of doing the above is how we detect upon restart
that the server received proc die while waiting for sync replication
ACK. One idea might be to set a flag in the control file before the
crash. Second idea might be to write a marker file (although I don't favor this idea); presence indicates that the server was waiting for
sync replication ACK before the crash. However, we may not detect all
sorts of crashes in a backend when it is waiting for sync replication
ACK to do any of these two ideas. Therefore, this may not be a
complete solution.

You cannot control t= he crash, it can be a simple power failure too and none of=C2=A0them could = have reached the disk.
Additionally, this is in a critical transa= ction commit path.
=C2=A0

Third idea might be to just let the primary wait for sync standbys to
catch up upon restart irrespective of whether it was crashed or not
while waiting for sync replication ACK. While this idea works well
without having to detect all sorts of crashes, the primary may not
come up if any unresponsive standbys are present (currently, the
primary continues to be operational for read-only queries at least
irrespective of whether sync standbys have caught up or not).

I prefer this approach because depending on the quor= um policy defined in the synchrnous_standby_names, the primary will open co= nnections for read/writes.
If there is no progress from sync stan= dbys then Postgres admin has to jump in regardless.
=C2=A0
<= div>Thanks,
--0000000000009927dc064d476732--