Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1p05im-0004bJ-Fl for pgsql-hackers@arkaria.postgresql.org; Tue, 29 Nov 2022 18:52:44 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1p05il-0002nY-7o for pgsql-hackers@arkaria.postgresql.org; Tue, 29 Nov 2022 18:52:43 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1p05ik-0002m8-TN for pgsql-hackers@lists.postgresql.org; Tue, 29 Nov 2022 18:52:42 +0000 Received: from mail-yw1-x112d.google.com ([2607:f8b0:4864:20::112d]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1p05if-00050d-SO for pgsql-hackers@lists.postgresql.org; Tue, 29 Nov 2022 18:52:41 +0000 Received: by mail-yw1-x112d.google.com with SMTP id 00721157ae682-3b48b139b46so148124587b3.12 for ; Tue, 29 Nov 2022 10:52:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=oAB5EbE333rvQJERoT4/cBuGkdz+6yqv1CZIahkaEJc=; b=ode+BgaHiGovY93+XqNAMn68u0IhIpwB9O/mbS6IeJ3UPytXthUtbSVmUPbp0UE2Zx qkDRrThdWMgk+6LZL7SQxoWIGRGh4iPVvy/xD2bw8WPI8nVritaAfvnkrq8G6cqos5IM KosDA2tzbnD9tr2TCXLmHjKPX+7+Nj5v7eojYFKCNJFYxetCEVVqs1MbIDJv5BilwGOr y4VqTxhdt9mfaIYOKgOgNje5Lte1gXygfLKRqoPUiw70nc+LALOv2PZPbv6iZP9lcQtG DW0yilqQQg2hXATiZxu/PNIyCu//XEJycc3DFCs6AZHPNXe7Ev/wmUSatskexZPU5oRZ y+Hg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=oAB5EbE333rvQJERoT4/cBuGkdz+6yqv1CZIahkaEJc=; b=6cEB7eeqqXt1yUeH596a0d+bYcc+1BoCFb7canRBtFsjC0KI1S3Gh6IYfGMrnKOJSp gbwnOiIv3ZUwDOBBOPMxqNGZUHp0BgSMo7hqkAmANT17u2++Kzixr7P3seWkWeVduMHR Wm0PUqPq2MVVHGgJCmTjAla+cfSeE2qNB/cUHZ7sLqCMM3OnRuqhqtkzpRx2v2j8gxq1 q3apFOMO6RdslJtquaU5WRKo4MtRWz0a6nqMTBDudVKXO2POmIiOprNbsfr2AwugNN+q xcLig1vfAxZFlSMWt5MsJn5DzsLJsHOj7jfHxSUkHiR2qbLCLvDtTNDDAWgeKyB/9iqw S5uw== X-Gm-Message-State: ANoB5pnH6lAqW7mPrS+4Jc0YoftpUIidW5MF9kN9/qZFbrDSYzvliX9w 9iDOROFsLE46IbNWEsuOF+gTxL/3vNlpYMZLQ5Y= X-Google-Smtp-Source: AA0mqf48mdK8fvvLXbPNUKoroNaUZleUBz7Ys++0z1VQXhL5tE8LsKZiOOZV0q+gW/b56AdJRTuDQLFPF7vwRtNiY9o= X-Received: by 2002:a05:690c:854:b0:370:4a99:df7d with SMTP id bz20-20020a05690c085400b003704a99df7dmr52736251ywb.308.1669747956121; Tue, 29 Nov 2022 10:52:36 -0800 (PST) MIME-Version: 1.0 References: <20220805.114916.994654810780821553.horikyota.ntt@gmail.com> <20220809.161236.1486509314201074910.horikyota.ntt@gmail.com> In-Reply-To: From: Andrey Borodin Date: Tue, 29 Nov 2022 10:52:24 -0800 Message-ID: Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication To: Bruce Momjian Cc: SATYANARAYANA NARLAPURAM , Bharath Rupireddy , Kyotaro Horiguchi , Laurenz Albe , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Nov 29, 2022 at 8:29 AM Bruce Momjian wrote: > > On Tue, Nov 29, 2022 at 08:14:10AM -0800, SATYANARAYANA NARLAPURAM wrote: > > 2. Process proc die immediately when a backend is waiting for sync > > replication acknowledgement, as it does today, however, upon restart, > > don't open up for business (don't accept ready-only connections) > > unless the sync standbys have caught up. > > > > > > Are you planning to block connections or queries to the database? It would be > > good to allow connections and let them query the monitoring views but block the > > queries until sync standby have caught up. Otherwise, this leaves a monitoring > > hole. In cloud, I presume superusers are allowed to connect and monitor (end > > customers are not the role members and can't query the data). The same can't be > > true for all the installations. Could you please add more details on your > > approach? > > I think ALTER SYSTEM should be allowed, particularly so you can modify > synchronous_standby_names, no? We don't allow SQL access during crash recovery until it's caught up to consistency point. And that's for a reason - the cluster may have invalid system catalog. So no, after crash without a quorum of standbys you can only change auto.conf and send SIGHUP. Accessing the system catalog during crash recovery is another unrelated problem. But I'd propose to treat these two points differently, they possess drastically different scales of danger. Query Cancels are issued here and there during failovers\switchovers. Crash amidst network partitioning is not that common. Best regards, Andrey Borodin.