Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sHf7E-009v7V-P2 for pgsql-general@arkaria.postgresql.org; Thu, 13 Jun 2024 07:43:24 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sHf7C-00ErSV-AX for pgsql-general@arkaria.postgresql.org; Thu, 13 Jun 2024 07:43:23 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sHf7B-00ErRf-S6 for pgsql-general@lists.postgresql.org; Thu, 13 Jun 2024 07:43:22 +0000 Received: from mail-yw1-x1132.google.com ([2607:f8b0:4864:20::1132]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sHf79-0015Tt-QC for pgsql-general@lists.postgresql.org; Thu, 13 Jun 2024 07:43:21 +0000 Received: by mail-yw1-x1132.google.com with SMTP id 00721157ae682-62ffc495935so7942817b3.1 for ; Thu, 13 Jun 2024 00:43:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718264599; x=1718869399; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=tXu+jEtN14P2n3NEYLYVOKYHaVNuAndL5Q0dhbRDjdg=; b=E0s5BhsWOjtXJUhAbXhT56OmeooAkTgLV9n9kwGXAlPPJQmPukLh1mM9OWjuxWUeeD 4tn4VC9xmcYq3Fn9H/KgeoVqd+M6fMuIiFZR6YAxBgcXmFIdSYjuNWQmdzYS4Pv4ZXB9 A3SRXg4MMw/vZ0XKqZ+n77Buxwsv+HqsJUVXCB1kF3vS6mE7jpiBMaee9yCIq+xR31Il s6u0RetkGGWz4rhsGEwoBDWFaoWE2ollX7Hhf2GRsImUBySyNf+8kxBxAT9C5IIN64B6 WPNMN/jkGosOL+iAZnHfY1sfcFL8Q/ZsMVYD4rBHgPDsqtj401wHs5wOTlA0D6q5e8Cz RsLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718264599; x=1718869399; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=tXu+jEtN14P2n3NEYLYVOKYHaVNuAndL5Q0dhbRDjdg=; b=RbSOQUZ0eqwHbkF5I94YkLhg1INfk65afKiSGwg24hMKMRSGJGQWcYf7dEZ/Ag/gSv qvooOKMBPkqfHgvdzLUyGILqNMPmQC/kPipdKazFR4GSNu4+rGHfARJwyCCwBUt4N7cM h0R3WSeF9Hf+jo+lZb0R8li9YS1HVSzgHScvrEePveyb6LSrpB2LkckqRfl5EO/5FCWS y5xULW4LAiMFE3vphiQ4r3CngUJf2baQyqT+fTTS7T492AJQvNuce9at7Q+bf0ooeqpQ MbfhsRF0vUAKVi7O6ZZy+yLha9/TvbRD+GoPbfUM3OvjjrW96buyQAfXhUJzFzvf4yxY ZSPA== X-Forwarded-Encrypted: i=1; AJvYcCV/Z5+oSTRH1aO9uUVnb/Jw42xU5qoQga/1nReefjW526fwKnjewzU0wu2FPnn7EQCmI9mppF+KxQMOhrNc2YsNXaALvvkLu0MUcwZwxtVrelxE X-Gm-Message-State: AOJu0YxQDnzLNgZIm+w4T3UCCGSkj0bpHaw8pPURcLgzw4EaRDClqexV dfIUlZOElVo+B84gcztvwrME7ugwe1iVBOnehVo8Xgw+l2BHTm6de7zPaI0b6bNt5yU1phJI+XJ kDroC2cRqEWGoc4Qa10etlwSg8uo= X-Google-Smtp-Source: AGHT+IHw/z4NBq9CU/Xsp6oinnl+KSf+2IEkhc2EoA9ft3hjYjlazqC1a5Up2ptZ4B1iR+CK5FX/dcC0AY0crDic7zw= X-Received: by 2002:a0d:da43:0:b0:62c:e537:3750 with SMTP id 00721157ae682-62fb7f99d81mr38502757b3.10.1718264597592; Thu, 13 Jun 2024 00:43:17 -0700 (PDT) MIME-Version: 1.0 References: <891bcfec74f7358ef0212caf6565a35153dd2941.camel@cybertec.at> <56ad97911d83f721dd872e8ee68cd77d50d3eef6.camel@cybertec.at> In-Reply-To: From: yudhi s Date: Thu, 13 Jun 2024 13:13:06 +0530 Message-ID: Subject: Re: Long running query causing XID limit breach To: sud , Laurenz Albe Cc: Simon Elbaz , pgsql-general Content-Type: multipart/alternative; boundary="000000000000f846b1061ac0a6f2" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000f846b1061ac0a6f2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sat, Jun 8, 2024 at 2:51=E2=80=AFPM sud wrote: > > Thank You so much Laurenz and Yudhi. > > Yes its RDS and as you mentioned there does exist a space limitation of > ~64TB but as Laurenz mentioned the only time the second standby may crash > would be probably because of the storage space saturation and thus we ne= ed > to have appropriate monitoring in place to find this and get alerted > beforehand. And also a monitoring to see how much WAL gets generated per > hour/day to get an idea of the usage. I am not sure how to do it , but wi= ll > check on this. > Not exactly related but just for our information, While going through the "aurora postgres" database docs in regards to similar concepts which are getting discussed here, I am finding some interesting stuff. https://aws.amazon.com/blogs/database/manage-long-running-read-queries-on-a= mazon-aurora-postgresql-compatible-edition/ *Cancel the conflicting query on the reader node if the conflict lasts longer than max_standby_streaming_delay (maximum 30 seconds). This is different from Amazon RDS or self-managed PostgreSQL. With Amazon RDS or self-managed PostgreSQL, the instance has its own physical copy of the database, and you=E2=80=99re able to set the parameter max_standby_streamin= g_delay as high as you want to prevent query cancellation.If the conflicting query can=E2=80=99t cancel in time, or if multiple long-running queries are causi= ng the replication lag to go beyond 60 seconds, Aurora restarts the reader node to ensure it=E2=80=99s not lagging far behind the primary node.* So if i get it correct it means, even if hot_standby_feedback is set to OFF= , the constraints of max_standby_streaming_delay (30 seconds) and the 60-second replication lag limit applies. And thus Aurora may cancel long-running queries or restart reader nodes to maintain synchronization even if it just runs for >60seconds. So it's really odd but does that mean , by no way you can guarantee a query to run >60 seconds on read replica in aurora postgres? --000000000000f846b1061ac0a6f2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Sat, Jun 8, 2024 at 2:51=E2= =80=AFPM sud <suds1434@gmail.com> wrote:

<= /div>
Thank You so much Laurenz and Yudhi.

Yes= its RDS and as you mentioned there does exist=C2=A0a space limitation of ~= 64TB but as Laurenz mentioned the only time the second standby may crash wo= uld be probably because of=C2=A0 the storage=C2=A0space saturation and thus= we need to have appropriate monitoring in place to find this and get alert= ed beforehand. And also a monitoring to see how much WAL gets generated=C2= =A0per hour/day to get an idea of the usage. I am not sure how to do it , b= ut will check on this.

=C2=A0<= /div>
Not exactly related but just for our inform= ation, While going through the "aurora postgres" database docs in= regards to similar concepts which are getting discussed here, I am finding= some interesting=C2=A0stuff.



--000000000000f846b1061ac0a6f2--