Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sAN6b-009JIh-NY for pgsql-general@arkaria.postgresql.org; Fri, 24 May 2024 05:04:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sAN6a-002XiE-Vm for pgsql-general@arkaria.postgresql.org; Fri, 24 May 2024 05:04:36 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sAN6a-002Xem-Fy for pgsql-general@lists.postgresql.org; Fri, 24 May 2024 05:04:36 +0000 Received: from mail-vk1-xa2e.google.com ([2607:f8b0:4864:20::a2e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sAN6X-001jOe-2t for pgsql-general@lists.postgresql.org; Fri, 24 May 2024 05:04:35 +0000 Received: by mail-vk1-xa2e.google.com with SMTP id 71dfb90a1353d-4e4f02f4c6fso203467e0c.2 for ; Thu, 23 May 2024 22:04:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1716527072; x=1717131872; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=NW54TrPwjs4yk7Nc2tj0crqQOXLZas45ZDKmN4va4Fg=; b=SsckMo7SIi3HqmfttHod0dv8nJms54ChSUqa7L1tv0ubyZ9x12IQf5rbObX3rJrS+F yc1NxCutb8nim2FZTM42w4Qo1afREGXIQX2RBPape6JQZX0jwdDt2NZRK+6VBqDuWWO2 lhLRoDwNbNqjenpS0Dgb2bjC/m9pdnuYy02budflyOUUzd001Xp9xq/uVNm3AoX+773d BdZK4DExXdrd+B2ZdSrUZG60RDPYmLQz9X2GgmnEQEZomrTTzYo/KY0tBwaCaXvnkCA7 7Zxae34fVwddMYJdNlWkFORDtjcE51lacm278BQ5Gl8MmLvEbVz8JwHaI+znaTOSokUC PWIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716527072; x=1717131872; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=NW54TrPwjs4yk7Nc2tj0crqQOXLZas45ZDKmN4va4Fg=; b=PYb+mXpcboP0n4EgfBblxRpIW+rbcHJGgQuaUg527JH7aNVSMKkPqfn8wJwzeuaxm3 mnb3YGv/wQgcydLlsa3QoCTWCej8Aitayqd5TEtw+ioDJ/tRii1jRHilt4Czl9QOJuQi NkDIf7t85lb2OhQVCmJahcYO2gaVrG9y5VB+QLeQ2PqqkQoQ+zrvk6E5g6wXCe6zwkOv Mo7eCaz8UzRMn/VJKnmTB8h6Og8BSM0KbuJQ+i7c9RSgI7YU84X1goL3OsuBfuc/C8HN LU5zaMa4CzgrcQE3TFYIS10u7Enp41PDOhcKjF45A5lCQ/GuKE+EbqzXcimQE+lgvKTv wiDQ== X-Gm-Message-State: AOJu0YzQBaARNIbnqyUyYZ3esXbN0YsGMhciIbDfG4Icd0g/lHMDhSaj saEhG3QRJ8WbfTONphmRIYrwtA/IRejSxZYVYAlYusZC1RY8AAX7E7kyeeV4PBviizWuPb/kqr6 cKDfMsVyLI9fHSgus04IJbl2H73o= X-Google-Smtp-Source: AGHT+IGmEaom0N2E32daLL3pFYUlQsUqik7Vy5lFp8ejDkPyJt7hoNYnMwkon7yGbaqyIB2Yx6HjX8SAohAcpf7hdq4= X-Received: by 2002:a05:6122:319a:b0:4d8:7222:b6da with SMTP id 71dfb90a1353d-4e4f027193cmr1432917e0c.6.1716527071702; Thu, 23 May 2024 22:04:31 -0700 (PDT) MIME-Version: 1.0 References: <891bcfec74f7358ef0212caf6565a35153dd2941.camel@cybertec.at> In-Reply-To: <891bcfec74f7358ef0212caf6565a35153dd2941.camel@cybertec.at> From: sud Date: Fri, 24 May 2024 10:34:20 +0530 Message-ID: Subject: Re: Long running query causing XID limit breach To: Laurenz Albe Cc: pgsql-general Content-Type: multipart/alternative; boundary="0000000000005b369a06192c1a64" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000005b369a06192c1a64 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, May 23, 2024 at 8:11=E2=80=AFPM Laurenz Albe wrote: > On Thu, 2024-05-23 at 18:15 +0530, sud wrote: > > On Thu, May 23, 2024 at 1:45=E2=80=AFPM Laurenz Albe > wrote: > > > If a long running query on the standby influences the primary, that > means that > > > you have "hot_standby_feedback" set to "on". Set it to "off". > > > > Will the setting up of "hot_standby_feedback" value to OFF will cause t= he > > reader instance to give incorrect query results or unexpected query > failure > > which will be potential inconsistency between the writer and reader > instance, > > as because those XID's can be removed/cleaned by the writer node even i= f > its > > being read by the reader instance query. And it can have more > replication lag. > > There will never be incorrect query results. > > It can happen that a query on the standby gets canceled if you don't set > "max_standby_streaming_delay" to -1, but that can happen even if > "hot_standby_feedback" is "on". It just happens less often. > > The effect of setting "max_standby_streaming_delay" to -1 will often be a > replication delay if you run a long query. That's what you have to > accept if you want to execute long-running queries. > > You will never be able to have both of the following: > - queries never get canceled > - there is no replication delay > > > I am trying to understand these two parameters and each time it looks a bit confusing to me. If These two parameters complement or conflict with each other. Say for example, If we set hot_feedback_standby to ON (which is currently set as default ON by the way), it will make the primary wait till the query completion at standby and can cause such a high bump in XID in scenarios where the query on standby runs for days(like in our current scenario which happens). So we were thinking of setting it as OFF, to avoid the transaction ID wrap around issue.. But as you also mentioned to set the "max_standby_streaming_delay" to -1 (which is currently set as 14 second in our case) ,it will wait infinitely , till the query completes on the standby and wont apply the WAL which can cause override of the XID which the standby query is reading from. But wont this same behaviour be happening while we have hot_feedback_standby set as "ON"? But again for HA , in case primary down we should not be in big lag for the standby and thus we want the standby also with minimal lag. And as you mentioned there will never be incorrect results but at amx it will be query cancellation, so I was thinking , if it's fine to just keep the "hot_feedback_standby" as OFF and let the max_standby_streaming_delay set as it is like 14 sec. Let me know your thoughts. Basically below are the combinations, i am confused between.. hot_feedback_stanby ON and max_standby_streaming_delay=3D-1 or hot_feedback_stanby OFF and max_standby_streaming_delay=3D-1 Or hot_feedback_stanby ON and max_standby_streaming_delay=3D14 sec Or hot_feedback_stanby OFF and max_standby_streaming_delay=3D14 sec --0000000000005b369a06192c1a64 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Thu, May 23, 2024 at 8:11=E2=80=AFPM L= aurenz Albe <laurenz.albe@cy= bertec.at> wrote:
On Thu, 2024-05-23 at 18:15 +0530, sud = wrote:
> On Thu, May 23, 2024 at 1:45=E2=80=AFPM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > If a long running query on the standby influences the primary, th= at means that
> > you have "hot_standby_feedback" set to "on".= =C2=A0 Set it to "off".
>
> Will the setting up of "hot_standby_feedback" value to OFF w= ill cause the
> reader instance to give incorrect=C2=A0query results or unexpected que= ry failure
> which will be potential inconsistency between the writer and reader in= stance,
> as because those XID's can be removed/cleaned by the writer node e= ven if its
> being read by the reader instance query. And it can have more replicat= ion lag.

There will never be incorrect query results.

It can happen that a query on the standby gets canceled if you don't se= t
"max_standby_streaming_delay" to -1, but that can happen even if<= br> "hot_standby_feedback" is "on".=C2=A0 It just happens l= ess often.

The effect of setting "max_standby_streaming_delay" to -1 will of= ten be a
replication delay if you run a long query.=C2=A0 That's what you have t= o
accept if you want to execute long-running queries.

You will never be able to have both of the following:
- queries never get canceled
- there is no replication delay








--0000000000005b369a06192c1a64--