Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
From: Andrey Borodin
Date: Mon, 25 Jul 2022 15:50:25 +0500
To: Bharath Rupireddy
Cc: Dilip Kumar, Laurenz Albe, PostgreSQL Hackers, SATYANARAYANA NARLAPURAM
Message-Id: <4F070B19-51EC-4A05-A111-6001A961F991@yandex-team.ru>

> On 25 July 2022, at 14:29, Bharath Rupireddy wrote:
>
> Hm, after thinking for a while, I tend to agree with the above
> approach - meaning, query cancel interrupt processing can completely
> be disabled in SyncRepWaitForLSN() and process proc die interrupt
> immediately, this approach requires no GUC as opposed to the proposed
> v1 patch upthread.

The GUC was proposed here [0] to maintain compatibility with the previous behaviour. But I think that having no GUC is fine too, provided, of course, that we do not allow cancellation of backends that are still waiting for replication.

>>
>> And yes, we need additional complexity - but in some other place. Transaction can also be locally committed in presence of a server crash. But this is another difficult problem. Crashed server must not allow data queries until LSN of timeline end is successfully replicated to synchronous_standby_names.
>
> Hm, that needs to be done anyways. How about doing as proposed
> initially upthread [1]? Also, quoting the idea here [2].
>
> Thoughts?
>
> [1] https://www.postgresql.org/message-id/CALj2ACUrOB59QaE6=jF2cFAyv1MR7fzD8tr4YM5+OwEYG1SNzA@mail.gmail.com
> [2] 2) Wait for sync standbys to catch up upon restart after the crash or
> in the next txn after the old locally committed txn was canceled. One
> way to achieve this is to let the backend, that's making the first
> connection, wait for sync standbys to catch up in ClientAuthentication
> right after successful authentication. However, I'm not sure this is
> the best way to do it at this point.

I think that ideally the startup process should not allow read-only connections in CheckRecoveryConsistency() until WAL has been replicated to a quorum at least up to the new timeline's LSN.

Thanks!

[0] https://commitfest.postgresql.org/34/2402/
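
To make the two rules discussed above concrete, here is a toy Python model. It is only a sketch under stated assumptions and not PostgreSQL code: the names `wait_for_sync_rep`, `may_accept_queries`, `Interrupt` and `Outcome` are hypothetical, LSNs are plain integers, and the event stream stands in for whatever the real wait loop in SyncRepWaitForLSN() observes.

```python
from enum import Enum, auto


class Interrupt(Enum):
    QUERY_CANCEL = auto()  # e.g. a client cancel request
    PROC_DIE = auto()      # e.g. backend termination


class Outcome(Enum):
    REPLICATED = auto()    # sync standbys confirmed the commit LSN
    TERMINATED = auto()    # backend died while the commit was unconfirmed


def wait_for_sync_rep(commit_lsn, events):
    """Toy model of the proposed wait policy (hypothetical, not syncrep.c).

    `events` yields either an Interrupt or an int: the latest LSN
    confirmed by the synchronous standbys.
    Returns (outcome, number_of_ignored_cancels).
    """
    ignored_cancels = 0
    for event in events:
        if event is Interrupt.QUERY_CANCEL:
            # Proposed rule: a query cancel never releases the wait,
            # so the client is not told "committed" prematurely.
            ignored_cancels += 1
            continue
        if event is Interrupt.PROC_DIE:
            # Proposed rule: termination is processed immediately.
            return Outcome.TERMINATED, ignored_cancels
        if isinstance(event, int) and event >= commit_lsn:
            return Outcome.REPLICATED, ignored_cancels
    raise RuntimeError("event stream ended while still waiting")


def may_accept_queries(timeline_end_lsn, confirmed_lsn):
    """After crash recovery, allow queries only once the standbys have
    confirmed WAL at least up to the end of the old timeline."""
    return confirmed_lsn >= timeline_end_lsn
```

In this model a cancel arriving before confirmation is counted and skipped, while a die request ends the wait at once; the second function is the gate that the startup process would apply before opening connections.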