Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1ogLrD-0005Wl-NA for pgsql-hackers@arkaria.postgresql.org; Thu, 06 Oct 2022 08:03:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1ogLrB-00070v-Es for pgsql-hackers@arkaria.postgresql.org; Thu, 06 Oct 2022 08:03:49 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1ogLrB-00070h-2q for pgsql-hackers@lists.postgresql.org; Thu, 06 Oct 2022 08:03:49 +0000 Received: from mail-wr1-x42c.google.com ([2a00:1450:4864:20::42c]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1ogLr8-0008Ub-HR for pgsql-hackers@lists.postgresql.org; Thu, 06 Oct 2022 08:03:48 +0000 Received: by mail-wr1-x42c.google.com with SMTP id a10so1386476wrm.12 for ; Thu, 06 Oct 2022 01:03:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=tBCVefW+oP7+nRyyyloEsMcs4JN8pW1Qs0X3/78mxnU=; b=fb/JujOrMyHZCliuo4sU6e7bGLgOGoF5ZSgSEjIPwFs2W56CX4u+umSSAa2AWHWqFi PWAw4MR/7mn4oesBnpOAsb1udgdaktgTmFyUdjCKN5L9SmWQ9iljF9kZkfbi0ilGpD7X 7CtAFs5gBMUO0GHe2M1Fh9JMNGsaYh8WJSuvMknw/+3i+BTN9sCr400l+lYqnFFJmnVg /2tFFu3RoXjjYq1pTyJvgp71wRNzCy2SXLRNinHShO2T5DvZmuMBXc9gRfG7o4meGi7z 4BkRYIZyidPYcMKTK5OLqfEmNbDWzCHVGHpoVfC7D91KmHRPIesZHthP1GhSv0V45Ko6 QRuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=tBCVefW+oP7+nRyyyloEsMcs4JN8pW1Qs0X3/78mxnU=; b=pYD+CllYbIhDG7wGB1P3AWcDUtO+cLEOUdgNCVhoOm867GmMu0gNxn623lWhxi0e9C renxve9jRxhTbMtnRGetyF2OUSMsRvBxeWr+xXKQtCwI+q4oy/E5EbiA2bPKXiffuoX3 RPbUfrpBWTNf2lyY/K0EZFAHKmh0OufLY58wGPVOM+v6zAVU7bIJ+QpOjvehQGqNWz3I pQwXcxcff+8G8RpVba+vLZ25RnZKcVrzAWV1Nnl2iY2zk8ROjb/H6lFG7t4gdty3vnZK v1ZcIAuoNjMmVzc9APN0zGrO+ODWwe+374QzXAE0WctDGpPJEu36ejV5AdouyQXuT6k+ lMtg== X-Gm-Message-State: ACrzQf1Hdlx/vq4Wuecmsacla2vVdl09ynI5XxUupciu+M+KeiTWdQas g0i6t2iXg+rbLdS+eHgzIe0UcvsFoz+A9MU1QLw= X-Google-Smtp-Source: AMsMyM6TyBQ6sqdvKpSVIolGWh3rF3vHl04JOCEZgpZXFfrH/i/DpMmhRT9TZZjt3JyVvVwDXa3PPzluCG9cstL88eg= X-Received: by 2002:a05:6000:154b:b0:22e:64fa:dc24 with SMTP id 11-20020a056000154b00b0022e64fadc24mr2299083wry.113.1665043424944; Thu, 06 Oct 2022 01:03:44 -0700 (PDT) MIME-Version: 1.0 References: <9290b55b6ae2b04e002ca9dadadd1cca09461482.camel@cybertec.at> <20220805.114916.994654810780821553.horikyota.ntt@gmail.com> <20220809.161236.1486509314201074910.horikyota.ntt@gmail.com> In-Reply-To: From: Bharath Rupireddy Date: Thu, 6 Oct 2022 13:33:33 +0530 Message-ID: Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication To: Bruce Momjian Cc: Kyotaro Horiguchi , Laurenz Albe , PostgreSQL Hackers , SATYANARAYANA NARLAPURAM Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Oct 6, 2022 at 2:30 AM Bruce Momjian wrote: > > As I highlighted above, by default you notify the administrator that a > sychronous replica is not responding and then ignore it. If it becomes > responsive again, you notify the administrator again and add it back as > a sychronous replica. > > > command in any form may pose security risks. I'm not sure at this > > point how this new timeout is going to work alongside > > wal_sender_timeout. > > We have archive_command, so I don't see a problem with another shell > command. Why do we need a new command to inform the admin/user about a sync replication being ignored (from sync quorum) for not responding or acknowledging for a certain amount of time in SyncRepWaitForLSN()? Can't we just add an extra column or use existing sync_state in pg_stat_replication()? We can either introduce a new state such as temporary_async or just use the existing state 'potential' [1]. A problem is that the server has to be monitored for this extra, new state. If we do this, we don't need another command to report. > > I'm thinking about the possible options that an admin has to get out > > of this situation: > > 1) Removing the standby from synchronous_standby_names. > > Yes, see above. We might need a read-only GUC that reports which > sychronous replicas are active. As you can see, there is a lot of API > design required here, but this is the most effective approach. If we use the above approach to report via pg_stat_replication(), we don't need this. > > > Once we have that, we can consider removing the cancel ability while > > > waiting for synchronous replicas (since we have the timeout) or make it > > > optional. We can also consider how do notify the administrator during > > > query cancel (if we allow it), backend abrupt exit/crash, and > > > > Yeah. If we have the > > timeout-and-auto-removal-of-standby-from-sync-standbys-list solution, > > the users can then choose to disable processing query cancels/proc > > dies while waiting for sync replication in SyncRepWaitForLSN(). > > Yes. We might also change things so a query cancel that happens during > sychronous replica waiting can only be done by an administrator, not the > session owner. Again, lots of design needed here. Yes, we need infrastructure to track who issued the query cancel or proc die and so on. IMO, it's not a good way to allow/disallow query cancels or CTRL+C based on role types - superusers or users with replication roles or users who are members of any of predefined roles. In general, it is the walsender serving sync standby that has to mark itself as async standby by removing itself from synchronous_standby_names, reloading config variables and waking up the backends that are waiting in syncrep wait queue for it to update LSN. And, the new auto removal timeout should always be set to less than wal_sender_timeout. All that said, imagine we have timeout-and-auto-removal-of-standby-from-sync-standbys-list solution in one or the other forms with auto removal timeout set to 5 minutes, any of following can happen: 1) query is stuck waiting for sync standby ack in SyncRepWaitForLSN(), no query cancel or proc die interrupt is arrived, the sync standby is made as async standy after the timeout i.e. 5 minutes. 2) query is stuck waiting for sync standby ack in SyncRepWaitForLSN(), say for about 3 minutes, then query cancel or proc die interrupt is arrived, should we immediately process it or wait for timeout to happen (2 more minutes) and then process the interrupt? If we immediately process the interrupts, then the locally-committed-but-not-replicated-to-sync-standby problems described upthread [2] are left unresolved. [1] https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-VIEW sync_state text Synchronous state of this standby server. Possible values are: async: This standby server is asynchronous. potential: This standby server is now asynchronous, but can potentially become synchronous if one of current synchronous ones fails. sync: This standby server is synchronous. quorum: This standby server is considered as a candidate for quorum standbys. [2] https://www.postgresql.org/message-id/CALj2ACXmMWtpmuT-%3Dv8F%2BLk4QCbdkeN%2ByHKXeRGKFfjG96YbKA%40mail.gmail.com -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com