public inbox for [email protected]help / color / mirror / Atom feed
canceling/terminating statement due to conflict with recovery in Replica/DR instances 7+ messages / 4 participants [nested] [flat]
* canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 05:59 Ishan joshi <[email protected]> 0 siblings, 2 replies; 7+ messages in thread From: Ishan joshi @ 2025-09-30 05:59 UTC (permalink / raw) To: pgsql-admin Hi Team, We are using Postgresql 16.9 in production and with large database about 25TB of size. We have patroni setup with replica instance and DR patroni setup with patroni streaming. We have high volume and frequent commit in the database. There are few large tables for which we asked client to execute queries on DR/Replica instances but these queries are start getting failed with "canceling statement due to conflict with recovery" and "terminating statement due to conflict with recovery" error. As I understand the behavior is correct but we need to get rid of this issue. I gone through the old posts and some documentation and got to know that below parameters can help to reduce this error. max_standby_streaming_delay max_standby_archive_delay hot_standby_feedback = off Our queries are running for long period that makes me to set this value to some minutes/hours (lets set 900s) which is not feasible for production as it will start impacting the replication lag. Also, the queries will fail if it reaches to mentioned thresholds. If I set these parameters to "-1" (disable) then there will be direct impact on replication lag which will impact further queries on replica node and DR cluster. Can you please guide If any other better solution present for such scenario? Thanks & Regards, Ishan Joshi ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 06:16 Laurenz Albe <[email protected]> parent: Ishan joshi <[email protected]> 1 sibling, 1 reply; 7+ messages in thread From: Laurenz Albe @ 2025-09-30 06:16 UTC (permalink / raw) To: Ishan joshi <[email protected]>; pgsql-admin On Tue, 2025-09-30 at 05:59 +0000, Ishan joshi wrote: > We are using Postgresql 16.9 in production and with large database about 25TB > of size. We have patroni setup with replica instance and DR patroni setup with > patroni streaming. > > We have high volume and frequent commit in the database. There are few large > tables for which we asked client to execute queries on DR/Replica instances but > these queries are start getting failed with "canceling statement due to conflict > with recovery" and "terminating statement due to conflict with recovery" error. > > As I understand the behavior is correct but we need to get rid of this issue. > > I gone through the old posts and some documentation and got to know that below > parameters can help to reduce this error. > > max_standby_streaming_delay > max_standby_archive_delay > hot_standby_feedback = off > > Our queries are running for long period that makes me to set this value to some > minutes/hours (lets set 900s) which is not feasible for production as it will > start impacting the replication lag. Also, the queries will fail if it reaches > to mentioned thresholds. > > If I set these parameters to "-1" (disable) then there will be direct impact on > replication lag which will impact further queries on replica node and DR cluster. > > Can you please guide If any other better solution present for such scenario? No, there is no better solution. You can reduce replication conflicts by turning on "hot_standby_feedback" and by turning off "vacuum_truncate", but you probably won't be able to get rid of all replication conflicts. You can either have a small replay delay and canceled queries or no canceled queries, but the occasional replay delay. If you need both no delay and no canceled queries, the only clean solution is to have two standby servers. Yours, Laurenz Albe ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 07:58 Peter Gram <[email protected]> parent: Laurenz Albe <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: Peter Gram @ 2025-09-30 07:58 UTC (permalink / raw) To: Laurenz Albe <[email protected]>; +Cc: Ishan joshi <[email protected]>; pgsql-admin Hi Laurenz Thanks for all the answers you give on this list. Could you elaborate on why two or more standby servers would help in this case ? Med venlig hilsen Peter Gram Sæbyholmsvej 18 2500 Valby Mobile: (+45) 5374 7107 Email: [email protected] On Tue, 30 Sept 2025 at 08:17, Laurenz Albe <[email protected]> wrote: > On Tue, 2025-09-30 at 05:59 +0000, Ishan joshi wrote: > > We are using Postgresql 16.9 in production and with large database about > 25TB > > of size. We have patroni setup with replica instance and DR patroni > setup with > > patroni streaming. > > > > We have high volume and frequent commit in the database. There are few > large > > tables for which we asked client to execute queries on DR/Replica > instances but > > these queries are start getting failed with "canceling statement due to > conflict > > with recovery" and "terminating statement due to conflict with recovery" > error. > > > > As I understand the behavior is correct but we need to get rid of this > issue. > > > > I gone through the old posts and some documentation and got to know that > below > > parameters can help to reduce this error. > > > > max_standby_streaming_delay > > max_standby_archive_delay > > hot_standby_feedback = off > > > > Our queries are running for long period that makes me to set this value > to some > > minutes/hours (lets set 900s) which is not feasible for production as it > will > > start impacting the replication lag. Also, the queries will fail if it > reaches > > to mentioned thresholds. > > > > If I set these parameters to "-1" (disable) then there will be direct > impact on > > replication lag which will impact further queries on replica node and DR > cluster. > > > > Can you please guide If any other better solution present for such > scenario? > > No, there is no better solution. > > You can reduce replication conflicts by turning on "hot_standby_feedback" > and by > turning off "vacuum_truncate", but you probably won't be able to get rid > of all > replication conflicts. > > You can either have a small replay delay and canceled queries or no > canceled > queries, but the occasional replay delay. > > If you need both no delay and no canceled queries, the only clean solution > is > to have two standby servers. > > Yours, > Laurenz Albe > > > ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 09:40 Laurenz Albe <[email protected]> parent: Peter Gram <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: Laurenz Albe @ 2025-09-30 09:40 UTC (permalink / raw) To: Peter Gram <[email protected]>; +Cc: Ishan joshi <[email protected]>; pgsql-admin On Tue, 2025-09-30 at 09:58 +0200, Peter Gram wrote: > On Tue, 30 Sept 2025 at 08:17, Laurenz Albe <[email protected]> wrote: > > On Tue, 2025-09-30 at 05:59 +0000, Ishan joshi wrote: > > > There are few large > > > tables for which we asked client to execute queries on DR/Replica instances but > > > these queries are start getting failed with "canceling statement due to conflict > > > with recovery" and "terminating statement due to conflict with recovery" error. > > > > > > As I understand the behavior is correct but we need to get rid of this issue. > > > > > > I gone through the old posts and some documentation and got to know that below > > > parameters can help to reduce this error. > > > > > > max_standby_streaming_delay > > > max_standby_archive_delay > > > hot_standby_feedback = off > > > > > > Our queries are running for long period that makes me to set this value to some > > > minutes/hours (lets set 900s) which is not feasible for production as it will > > > start impacting the replication lag. Also, the queries will fail if it reaches > > > to mentioned thresholds. > > > > > > If I set these parameters to "-1" (disable) then there will be direct impact on > > > replication lag which will impact further queries on replica node and DR cluster. > > > > > > Can you please guide If any other better solution present for such scenario? > > > > No, there is no better solution. > > > > If you need both no delay and no canceled queries, the only clean solution is > > to have two standby servers. > > Could you elaborate on why two or more standby servers would help in this case ? One of the standby servers would have "max_standby_streaming_delay = 0" or "hot_standby = off", that one would be for high availability. The other one would have "max_standby_streaming_delay = -1" and would be used for queries. Yours, Laurenz Albe ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 12:36 Imran Khan <[email protected]> parent: Ishan joshi <[email protected]> 1 sibling, 1 reply; 7+ messages in thread From: Imran Khan @ 2025-09-30 12:36 UTC (permalink / raw) To: Ishan joshi <[email protected]>; +Cc: pgsql-admin Hi Isha, I believe you have partitions and correct type of indexes created for those tables. Also, is this 25 TB size grown over many years or just few years old? Parameters tuning can help but won't be a permanent solution. Having multiple replicas I believe can make sense at this point. Thanks, Imran On Tue, Sep 30, 2025, 8:59 AM Ishan joshi <[email protected]> wrote: > Hi Team, > > We are using Postgresql 16.9 in production and with large database about > 25TB of size. We have patroni setup with replica instance and DR patroni > setup with patroni streaming. > > We have high volume and frequent commit in the database. There are few > large tables for which we asked client to execute queries on DR/Replica > instances but these queries are start getting failed with "canceling > statement due to conflict with recovery" and "terminating statement due to > conflict with recovery" error. > > As I understand the behavior is correct but we need to get rid of this > issue. > > I gone through the old posts and some documentation and got to know that > below parameters can help to reduce this error. > > max_standby_streaming_delay > max_standby_archive_delay > hot_standby_feedback = off > > Our queries are running for long period that makes me to set this value to > some minutes/hours (lets set 900s) which is not feasible for production as > it will start impacting the replication lag. Also, the queries will fail if > it reaches to mentioned thresholds. > > If I set these parameters to "-1" (disable) then there will be direct > impact on replication lag which will impact further queries on replica node > and DR cluster. > > Can you please guide If any other better solution present for such > scenario? > > Thanks & Regards, > Ishan Joshi > ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 15:28 Ishan joshi <[email protected]> parent: Laurenz Albe <[email protected]> 0 siblings, 0 replies; 7+ messages in thread From: Ishan joshi @ 2025-09-30 15:28 UTC (permalink / raw) To: Laurenz Albe <[email protected]>; Peter Gram <[email protected]>; +Cc: pgsql-admin Hi Laurenz, Thanks, for your explanations. It makes sense for having another replica instance but in our case, it is not possible to have another replica instance with huge database size. We will see the impact with delaying the reply lag and act accordingly. Thanks & Regards, Ishan Joshi ________________________________ From: Laurenz Albe <[email protected]> Sent: 30 September 2025 15:10 To: Peter Gram <[email protected]> Cc: Ishan joshi <[email protected]>; [email protected] <[email protected]> Subject: Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances On Tue, 2025-09-30 at 09:58 +0200, Peter Gram wrote: > On Tue, 30 Sept 2025 at 08:17, Laurenz Albe <[email protected]> wrote: > > On Tue, 2025-09-30 at 05:59 +0000, Ishan joshi wrote: > > > There are few large > > > tables for which we asked client to execute queries on DR/Replica instances but > > > these queries are start getting failed with "canceling statement due to conflict > > > with recovery" and "terminating statement due to conflict with recovery" error. > > > > > > As I understand the behavior is correct but we need to get rid of this issue. > > > > > > I gone through the old posts and some documentation and got to know that below > > > parameters can help to reduce this error. > > > > > > max_standby_streaming_delay > > > max_standby_archive_delay > > > hot_standby_feedback = off > > > > > > Our queries are running for long period that makes me to set this value to some > > > minutes/hours (lets set 900s) which is not feasible for production as it will > > > start impacting the replication lag. Also, the queries will fail if it reaches > > > to mentioned thresholds. > > > > > > If I set these parameters to "-1" (disable) then there will be direct impact on > > > replication lag which will impact further queries on replica node and DR cluster. > > > > > > Can you please guide If any other better solution present for such scenario? > > > > No, there is no better solution. > > > > If you need both no delay and no canceled queries, the only clean solution is > > to have two standby servers. > > Could you elaborate on why two or more standby servers would help in this case ? One of the standby servers would have "max_standby_streaming_delay = 0" or "hot_standby = off", that one would be for high availability. The other one would have "max_standby_streaming_delay = -1" and would be used for queries. Yours, Laurenz Albe ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances @ 2025-09-30 15:32 Ishan joshi <[email protected]> parent: Imran Khan <[email protected]> 0 siblings, 0 replies; 7+ messages in thread From: Ishan joshi @ 2025-09-30 15:32 UTC (permalink / raw) To: Imran Khan <[email protected]>; +Cc: pgsql-admin Hi Imran, Thanks for your reply. We have migrated from Oracle to Postgres these 25TB database. As the storage is huge we are not in position to create new replica instance/cluster. Yes, I also believe the tuning the parameter is not long-term solution but we will check the impact and validate the same. Thanks & Regards, Ishan Joshi ________________________________ From: Imran Khan <[email protected]> Sent: 30 September 2025 18:06 To: Ishan joshi <[email protected]> Cc: pgsql-admin <[email protected]> Subject: Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances Hi Isha, I believe you have partitions and correct type of indexes created for those tables. Also, is this 25 TB size grown over many years or just few years old? Parameters tuning can help but won't be a permanent solution. Having multiple replicas I believe can make sense at this point. Thanks, Imran On Tue, Sep 30, 2025, 8:59 AM Ishan joshi <[email protected]<mailto:[email protected]>> wrote: Hi Team, We are using Postgresql 16.9 in production and with large database about 25TB of size. We have patroni setup with replica instance and DR patroni setup with patroni streaming. We have high volume and frequent commit in the database. There are few large tables for which we asked client to execute queries on DR/Replica instances but these queries are start getting failed with "canceling statement due to conflict with recovery" and "terminating statement due to conflict with recovery" error. As I understand the behavior is correct but we need to get rid of this issue. I gone through the old posts and some documentation and got to know that below parameters can help to reduce this error. max_standby_streaming_delay max_standby_archive_delay hot_standby_feedback = off Our queries are running for long period that makes me to set this value to some minutes/hours (lets set 900s) which is not feasible for production as it will start impacting the replication lag. Also, the queries will fail if it reaches to mentioned thresholds. If I set these parameters to "-1" (disable) then there will be direct impact on replication lag which will impact further queries on replica node and DR cluster. Can you please guide If any other better solution present for such scenario? Thanks & Regards, Ishan Joshi ^ permalink raw reply [nested|flat] 7+ messages in thread
end of thread, other threads:[~2025-09-30 15:32 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2025-09-30 05:59 canceling/terminating statement due to conflict with recovery in Replica/DR instances Ishan joshi <[email protected]> 2025-09-30 06:16 ` Laurenz Albe <[email protected]> 2025-09-30 07:58 ` Peter Gram <[email protected]> 2025-09-30 09:40 ` Laurenz Albe <[email protected]> 2025-09-30 15:28 ` Ishan joshi <[email protected]> 2025-09-30 12:36 ` Imran Khan <[email protected]> 2025-09-30 15:32 ` Ishan joshi <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox