Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oLKsg-00033C-Fb for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Aug 2022 08:46:30 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1oLKsf-0006m6-5O for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Aug 2022 08:46:29 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oLKse-0006ge-Lr for pgsql-hackers@lists.postgresql.org; Tue, 09 Aug 2022 08:46:28 +0000 Received: from mail-lj1-x235.google.com ([2a00:1450:4864:20::235]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1oLKsX-0001nO-I7 for pgsql-hackers@lists.postgresql.org; Tue, 09 Aug 2022 08:46:27 +0000 Received: by mail-lj1-x235.google.com with SMTP id l21so828611ljj.2 for ; Tue, 09 Aug 2022 01:46:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc; bh=DE2uh8AL9pB3LEzHjGtvbV07DBBhVVjD3+0EjW6XHII=; b=VoW7q11hXmAD0hW1n5iqp0RtLo8TwuCSUcMxd/SFLNPdqwgCjs2FeEL3h7S7dpH1Gk IWoQMEuhcC9zwOrwJmjq3lcPSzBxx/OxUCiXx0HaE0tkP+7ySWrsofR7/4BBLYPs/iqi TCrR7kB7qS7Ne2xHr4JjkaZ1KGy8gZlUH1B67SIdJ1xCjYvpVDsMf2zdMzGg7UYUE4Qt VssAUcybol3mjCB1t+zOXih8q8GqnaaKRGPehaj9AefbrYAG4gFlRBaHBLtpWaqmXU2G 6+8O58344MHh96HE+z73GnutPZS2vtJ/03itdGFb9Bgq4xTPBJPjvVuDSOZJqYuGbVfb ke4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc; bh=DE2uh8AL9pB3LEzHjGtvbV07DBBhVVjD3+0EjW6XHII=; b=e1DY+GkOqmRsmdLD5/fSk98K85+U8fnbI+itomJUEdr4JNI15uXWMdBwo49xmc2v67 u5uK2Q5IHuG4gPHD/grQj0cv9Mdr7mEdlW3p+urVMFfSOAGXunGCh5AtP9J9svOanepe vTq7TtbM+uCM+HS6sJJDfPkJNu11Pc8WHdPLWdqOS+IVephnng7W1ItC/Gm3puL5Aqq3 l4mh5oGAYpkOK4wfOP2+/aJEXf0OqePIVwnINxU1uHvZWvYKLXDjldeFy0WrhGM7KxTt LXBttLq5kMxgKsMUI8Ofjrgn9pebjxObQ+wXPewU48KOasFuWWciQY+mC5Y8rnXeQtET F9EA== X-Gm-Message-State: ACgBeo17bP5Fy/duggyvtABrtl39fSFgtVVxB3IS9FK/xZ+iFH0MdJ3k TZN0Ck+dN83GQ3vA0qWBjiJ8/Mac1130u76D6BQ= X-Google-Smtp-Source: AA6agR5sOL+cMDNigF8nIwwaymWnqr+VJNavdZYYyYMvB8OruWOk6IeRHYF3mlnPms6VNJeTyLBBn8KmzVFWnCZwqtI= X-Received: by 2002:a05:651c:179a:b0:247:d37b:6ec5 with SMTP id bn26-20020a05651c179a00b00247d37b6ec5mr7260066ljb.112.1660034779006; Tue, 09 Aug 2022 01:46:19 -0700 (PDT) MIME-Version: 1.0 References: <9290b55b6ae2b04e002ca9dadadd1cca09461482.camel@cybertec.at> <20220805.114916.994654810780821553.horikyota.ntt@gmail.com> <20220809.161236.1486509314201074910.horikyota.ntt@gmail.com> In-Reply-To: <20220809.161236.1486509314201074910.horikyota.ntt@gmail.com> From: Bharath Rupireddy Date: Tue, 9 Aug 2022 14:16:00 +0530 Message-ID: Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication To: Kyotaro Horiguchi Cc: Laurenz Albe , PostgreSQL Hackers , SATYANARAYANA NARLAPURAM Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Aug 9, 2022 at 12:42 PM Kyotaro Horiguchi wrote: > > > Can you please explain more about your idea, I may be missing something? > > (I'm not sure I understand the requirements here..) I've explained the problem with the current HA setup with synchronous replication upthread at [1]. Let me reiterate it here once again. When a query is cancelled (a simple stroke of CTRL+C or pg_cancel_backend() call) while the txn is waiting for ack in SyncRepWaitForLSN(); for the client, the txn is actually committed (locally-committed-but-not-yet-replicated to all of sync standbys) like any other txn, a warning is emitted into server logs but it is of no use for the client (think of client as applications). There can be many such txns waiting for ack in SyncRepWaitForLSN() and query cancel can be issued on all of those sessions. The problem is that the subsequent reads will then be able to read all of those locally-committed-but-not-yet-replicated to all of sync standbys txns data - this is what I call an inconsistency (can we call this a read-after-write inconsistency?) because of lack of proper query cancel handling. And if the sync standbys are down or unable to come up for some reason, until then, the primary will be serving clients with the inconsistent data. BTW, I found a report of this problem here [2]. The solution proposed for the above problem is to just 'not honor query cancels at all while waiting for ack in SyncRepWaitForLSN()'. When a proc die is pending, then also, there can be locally-committed-but-not-yet-replicated to all of sync standbys txns. Typically, there are two choices for the clients 1) reuse the primary instance after restart 2) failover to one of sync standbys. For case (1), there might be read-after-write inconsistency as explained above. For case (2), those txns might get lost completely if the failover target sync standby or the new primary didn't receive them and the other sync standbys that have received them are now ahead and need a special treatment (run pg_rewind) for them to be able to connect to new primary. The solution proposed for case (1) of the above problem is to 'process the ProcDiePending immediately and upon restart the first backend can wait until the sync standbys are caught up to ensure no inconsistent reads'. The solution proposed for case (2) of the above problem is to 'either run pg_rewind for the sync standbys that are ahead or use the idea proposed at [3]'. I hope the above explanation helps. [1] https://www.postgresql.org/message-id/flat/CALj2ACUrOB59QaE6%3DjF2cFAyv1MR7fzD8tr4YM5%2BOwEYG1SNzA%40mail.gmail.com [2] https://stackoverflow.com/questions/42686097/how-to-disable-uncommited-reads-in-postgres-synchronous-replication [3] https://www.postgresql.org/message-id/CALj2ACX-xO-ZenQt1MWazj0Z3ziSXBMr24N_X2c0dYysPQghrw%40mail.gmail.com -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/