Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1no1Gr-0007MO-HL for pgsql-hackers@arkaria.postgresql.org; Mon, 09 May 2022 11:09:46 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1no1Go-00086d-Iz for pgsql-hackers@arkaria.postgresql.org; Mon, 09 May 2022 11:09:42 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1no1Go-00086U-5X for pgsql-hackers@lists.postgresql.org; Mon, 09 May 2022 11:09:42 +0000 Received: from forwardcorp1o.mail.yandex.net ([2a02:6b8:0:1a2d::193]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1no1Gg-0003SL-UC for pgsql-hackers@lists.postgresql.org; Mon, 09 May 2022 11:09:41 +0000 Received: from vla3-850de775f4df.qloud-c.yandex.net (vla3-850de775f4df.qloud-c.yandex.net [IPv6:2a02:6b8:c15:341d:0:640:850d:e775]) by forwardcorp1o.mail.yandex.net (Yandex) with ESMTP id 336982E0D74; Mon, 9 May 2022 14:09:32 +0300 (MSK) Received: from vla5-d6d5ce7a4718.qloud-c.yandex.net (vla5-d6d5ce7a4718.qloud-c.yandex.net [2a02:6b8:c18:341e:0:640:d6d5:ce7a]) by vla3-850de775f4df.qloud-c.yandex.net (mxbackcorp/Yandex) with ESMTP id 9XYLZyfW5m-9VKu12tF; Mon, 09 May 2022 14:09:32 +0300 X-Yandex-Fwd: 2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1652094572; bh=2o3lB2zkyMKGCHGcDvOCQ29wACb3sA2W0CeKU+juPGQ=; h=Message-Id:References:Date:Cc:In-Reply-To:Subject:To:From; b=Jqi1SwNswIm29tt0wP+CHULFweoGTqfqtgupg3zBsCF2g76rSqRI6n5QSKgjTmfhr MLqZdB5Qffheel4n8a1+X8FC+14GRJ37wtRdK/2IVzBa/qxRQncLaecItw7G+SlGu7 UFN7fDO8asHqiF7HWKbhmab0YFCG3MAGYmt83aXA= Authentication-Results: vla3-850de775f4df.qloud-c.yandex.net; dkim=pass header.i=@yandex-team.ru Received: from smtpclient.apple (unknown [2a02:6b8:b081:8119::1:1d]) by vla5-d6d5ce7a4718.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id s6PlRIBfNQ-9VNSwTnK; Mon, 09 May 2022 14:09:31 +0300 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client certificate not present) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.60.0.1.1\)) Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication From: Andrey Borodin In-Reply-To: Date: Mon, 9 May 2022 16:09:30 +0500 Cc: Laurenz Albe , PostgreSQL Hackers , SATYANARAYANA NARLAPURAM Content-Transfer-Encoding: quoted-printable Message-Id: <763B5AF0-1C9E-4796-9639-F969A2E66189@yandex-team.ru> References: <9290b55b6ae2b04e002ca9dadadd1cca09461482.camel@cybertec.at> To: Bharath Rupireddy X-Mailer: Apple Mail (2.3693.60.0.1.1) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > On 9 May 2022, at 14:20, Bharath Rupireddy = wrote: >=20 > On Tue, Apr 26, 2022 at 11:57 AM Laurenz Albe = wrote: >>=20 >> While this may mitigate the problem, I don't think it will deal with >> all the cases which could cause a transaction to end up committed = locally, >> but not on the synchronous standby. I think that only using the full >> power of two-phase commit can make this bulletproof. >=20 > Not sure if it's recommended to use 2PC in postgres HA with sync > replication where the documentation says that "PREPARE TRANSACTION" > and other 2PC commands are "intended for use by external transaction > management systems" and with explicit transactions. Whereas, the txns > within a postgres HA with sync replication always don't have to be > explicit txns. Am I missing something here? COMMIT PREPARED needs to be replicated as well, thus encountering the = very same problem as usual COMMIT: if done during failover it can be = canceled and committed data can be wrongfully reported durably written. = 2PC is not a remedy to the fact that PG silently cancels awaiting of = sync replication. The problem arise in presence of any "commit". And = "commit" is there if transactions are there. > On 9 May 2022, at 14:44, Dilip Kumar wrote: >=20 > IMHO, making it wait for some amount of time, based on GUC is not a > complete solution. It is just a hack to avoid the problem in some > cases. Disallowing cancelation of locally committed transactions is not a hack. = It's removing of a hack that was erroneously installed to make backend = responsible to Ctrl+C (or client side statement timeout). > On 26 Apr 2022, at 11:26, Laurenz Albe = wrote: >=20 > Is it worth adding additional complexity that is not a complete = solution? Its not additional complexity. It is removing additional complexity that = made sync rep interruptible. (But I'm surely talking not about GUCs like = synchronous_replication_naptime_before_cancel - wait of sync rep must be = indefinite until synchrous_commit\synchronous_standby_names are = satisfied ) And yes, we need additional complexity - but in some other place. = Transaction can also be locally committed in presence of a server crash. = But this another difficult problem. Crashed server must not allow data = queries until LSN of timeline end is successfully replicated to = synchronous_standby_names. Best regards, Andrey Borodin.=