Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ssakF-009R8u-7C for pgsql-general@arkaria.postgresql.org; Mon, 23 Sep 2024 04:32:20 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ssakE-000sii-IU for pgsql-general@arkaria.postgresql.org; Mon, 23 Sep 2024 04:32:18 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ssakD-000siZ-VZ for pgsql-general@lists.postgresql.org; Mon, 23 Sep 2024 04:32:18 +0000 Received: from mail-oo1-xc2d.google.com ([2607:f8b0:4864:20::c2d]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1ssakA-000aGs-N5 for pgsql-general@lists.postgresql.org; Mon, 23 Sep 2024 04:32:16 +0000 Received: by mail-oo1-xc2d.google.com with SMTP id 006d021491bc7-5e56759e6d7so2495369eaf.3 for ; Sun, 22 Sep 2024 21:32:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727065934; x=1727670734; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=J8Iku45nwhYU0MtMZizP1PMmmrwq5A0BOehtdB6N2CU=; b=WU0qLRK1bcB326Lh3V1rn1bk6ZuJLWubJGrZHm1edulUNlzWJ+ZRd/IXMa0g5P8VRX xz2ZbB9acDIcuN+ZXy9LwQaVFfyLA5qOD1Z1NIipCHO0RxX+j97hSUi1obde+vPHAK6M 2tfZYMRLzos6QTsF26ht4H/EAzcitWT3VHRRwwDX+gzstW9XApeUmLEnMDRspYOpo52s 9DlmS52GFCJO7C5fokwLih/xqVc/mFHy+2GA25tL2z3jIek6pvJez+asXq4rO8g6/D2O h5gameBKCzMIIJ4fXx7kno/fk1u7RsvA2Tlha/uzrEy3QUYn4yEw4LaiKi/WAkdf9c9t 4HuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727065934; x=1727670734; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=J8Iku45nwhYU0MtMZizP1PMmmrwq5A0BOehtdB6N2CU=; b=A0dNBiAyI/xRg6xaQNZNhs0OwoGYtX4se1G9kKQdjxWRY2zE27bvyvfyhmuV5eBoR+ JYbBN0ia4FYRscrcus7+OjlCVKUdNKudpJlRT4NZINF4WPMtnKAQJjNZ2hv7AjwcfQ6E VSwoJWtVS07UXmMyErE9JM9q4PBuhmhxmlLj7z/mwx/c6W+IIOLF3UlCxwmyXlppSfZm R5NNwSzUZ//dFq9D/RiSe0pwu/pA4zaHJYqNmwpaKYOZ7KF2L58QxT9yChf5t+DjQqNJ YBc96KfjisUUV7SV5BQATWrO1EvG4GIv9NbGV51m4gDrZoPDgUjEG37uIbLN43Oi35tf I0SQ== X-Gm-Message-State: AOJu0YyaUUoceytjqZmG6VhEQeh+Xmc6sRXOhcwGUoI+PzXo7VmHcPR6 352qrHBoAelV5PfR2iHwE2mKFMYNag7PpW6C/AbJy73Js8DetXKWLjlZ5YFrFHkPylpFb/ClNvA YrRWiBWbhMR1VQlGvWy6erS2JJOs= X-Google-Smtp-Source: AGHT+IEurKrN62ULJYKCxnIi5AFXcyYIz2YYdjq7hXkWr8vX5xae9bgoBg3q8N5+y6mE184QgHf2xw41O3wWG4jq8o8= X-Received: by 2002:a05:6820:2296:b0:5e5:7086:ebe8 with SMTP id 006d021491bc7-5e58b848286mr4443892eaf.0.1727065933860; Sun, 22 Sep 2024 21:32:13 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ramakrishna m Date: Mon, 23 Sep 2024 10:02:02 +0530 Message-ID: Subject: Re: Logical Replication Delay To: Justin Cc: pgsql-general@lists.postgresql.org, ravisql09@gmail.com Content-Type: multipart/alternative; boundary="0000000000007dd9300622c1dff2" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000007dd9300622c1dff2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Justin, Thank you for your suggestions and detailed insights. Due to certain business constraints, we are unable to split the tables into parallel publications. All of the tables involved are primary key tables, which adds further complexity to separating them into multiple publications= . That said, we truly appreciate your recommendations regarding the use of binary mode and reviewing the use of IDENTITY SET TO FULL. We will ensure that the subscriber is operating in binary mode and will recheck the identity setup to minimize WAL size and overhead. Regards, Ram. On Sun, 22 Sept 2024 at 01:32, Justin wrote: > Hi Ramakrishna, > > 4GB of WAL generated per minute is a lot. I would expect the replay on > the subscriber to lag behind because it is a single process. PostgreSQL = 16 > can create parallel workers for large transactions, however if there is a > flood of small transactions touching many tables the single LR worker is > going to fall behind. > > The only option is split the LR into multiple Publications and > Subscriptions as a single worker can't keep up. > > What is the justification to not split the tables across multiple > Publications and Subscriptions > > Additional items to check > > Make sure the Subscriber is using binary mode, this avoids an encoding > step. > https://www.postgresql.org/docs/current/sql-createsubscription.html > > Avoid the use of IDENTITY SET TO FULL on the publisher, if you do use > IDENTITY FULL make sure the subscriber table identity is set to a > qualifying unique index. In previous versions of PG the publisher and > subscriber identities had to match... > > IDENTITY SET TO FULL increase the size of the WAL and the work the > publisher and subscriber has to do. > > Hope this helps. > > > > On Sat, Sep 21, 2024 at 3:08=E2=80=AFPM Ramakrishna m wrote: > >> Hi Team, >> >> We have configured bidirectional replication (but traffic can only flow >> in one direction) between two data centers (distance: 1000 km, maximum >> Network latency: 100 ms) with an application TPS (transactions per secon= d) >> of 700 at maximum. >> >> We are fine with handling up to 500 TPS without observing any lag betwee= n >> the two data centers. However, when TPS increases, we notice a lag in WA= L >> files of over 100 GB (initially, it was 1 TB, but after tuning, it was >> reduced to 100 GB). During peak times, WAL files are generated at a rate= of >> 4 GB per minute. >> >> All transactions (Tx) take less than 200 ms, with a maximum of 1 second >> at times (no long-running transactions). >> >> *Here are the configured parameters and resources:* >> >> - *OS*: Ubuntu >> - *RAM*: 376 GB >> - *CPU*: 64 cores >> - *Swap*: 32 GB >> - *PostgreSQL Version*: 16.4 (each side has 3 nodes with Patroni and >> etcd configured) >> - *DB Size*: 15 TB >> >> *Parameters configured on both sides:* >> Name >> Setting Unit >> >> >> log_replication_commands off >> logical_decoding_work_mem 524288 kB >> max_logical_replication_workers 16 >> max_parallel_apply_workers_per_subscription 2 >> max_replication_slots 20 >> max_sync_workers_per_subscription 2 >> max_wal_senders 20 >> max_worker_processes 40 >> wal_level logical >> wal_receiver_timeout 600000 ms >> wal_segment_size 1073741824 B >> wal_sender_timeout 600000 ms >> >> *Optimizations applied:* >> >> 1. Vacuum freeze is managed during off-hours; no aggressive vacuum is >> triggered during business hours. >> 2. Converted a few tables to unlogged. >> 3. Removed unwanted tables from publication. >> 4. Partitioned all large tables. >> >> *Pending:* >> >> 1. Turning off/tuning autovacuum parameters to avoid triggering >> during business hours. >> >> *Not possible: *We are running all tables in a single publication, and >> it is not possible to separate them. >> >> I would greatly appreciate any suggestions you may have to help avoid >> logical replication delays, whether through tuning database or operating >> system parameters, or any other recommendations >> >> -- >> Thanks & Regards, >> Ram. >> > --0000000000007dd9300622c1dff2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi=C2=A0Justin,

Thank you for your suggestions and detailed insights.

D= ue to certain business constraints, we are unable to split the tables into = parallel publications. All of the tables involved are primary key tables, w= hich adds further complexity to separating them into multiple publications.=

That said, we truly appreciate your recommendations regarding the u= se of binary mode and reviewing the use of IDENTITY SET TO FULL. We will en= sure that the subscriber is operating in binary mode and will recheck the i= dentity setup to minimize WAL size and overhead.

Regards,
Ram.

On Sun, 22 S= ept 2024 at 01:32, Justin <zzzzz= .graf@gmail.com> wrote:
Hi=C2=A0Ramakrishna,

4GB of WAL gene= rated per minute is a lot.=C2=A0 I would expect the replay=C2=A0on the subs= criber to lag behind because it is a single process.=C2=A0 PostgreSQL 16 ca= n create parallel workers for large transactions, however if there is a flo= od of small transactions touching many tables the single LR worker is going= to fall behind.

The only option is split the=C2=A0LR into multiple= =C2=A0 Publications and Subscriptions as a single worker can't keep up.= =C2=A0

What is the justification to not split the tables across mult= iple Publications and Subscriptions=C2=A0

Additional=C2=A0items to c= heck

Make sure the Subscriber is using binary mode, this avoids an e= ncoding step.=C2=A0
https://www.postgresql.org/do= cs/current/sql-createsubscription.html

Avoid the use of IDENTITY= SET TO FULL on the publisher, if you do use IDENTITY FULL make sure the su= bscriber=C2=A0table identity is set to a qualifying unique index.=C2=A0 In = previous versions of PG the publisher and subscriber identities had to matc= h...

IDENTITY SET TO FULL increase the size of the WAL=C2=A0and=C2= =A0the work the publisher and subscriber has to do.=C2=A0

Hope this= =C2=A0helps.



On Sat, Sep 21, 2024 at 3:08=E2=80=AFPM Ramakrishna= m <ram.pgdb@gma= il.com> wrote:

Hi Team,

We have configured bidirectional replication (but traffic can = only flow in one direction) between two data centers (distance: 1000 km, ma= ximum Network latency: 100 ms) with an application TPS (transactions per se= cond) of 700 at maximum.

We are fine with = handling up to 500 TPS without observing any lag between the two data cente= rs. However, when TPS increases, we notice a lag in WAL files of over 100 G= B (initially, it was 1 TB, but after tuning, it was reduced to 100 GB). Dur= ing peak times, WAL files are generated at a rate of 4 GB per minute.

All transactions (Tx) take less than 200 ms, wi= th a maximum of 1 second at times (no long-running transactions).

Here are the configured parameters and reso= urces:

  • OS: = Ubuntu
  • RAM: 376 GB
  • CPU: 64 cores
  • Swap: 32 GB
  • PostgreSQL Version: 16.4 (each side has 3 node= s with Patroni and etcd configured)
  • DB Size: 15 TB

= Parameters configured on both sides:

<= th>Name
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Setting=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0Unit<= /th>
ms


log_replication_commandsoff
log= ical_decoding_work_mem524288kB
max= _logical_replication_workers16=
max_parallel_apply_workers_p= er_subscription=C2=A02<= /td>
max_replication_slots<= font color=3D"#000000" style=3D"background-color:rgb(255,255,255)" face=3D"= arial, sans-serif">20
= max_sync_workers_per_subscription2
max_wal_senders20
max_worker_processes40
wal_levellogical
w= al_receiver_timeout600000
wal_s= egment_size1073741824B
wal_sende= r_timeout600000ms

Optimizations applied:

  1. Va= cuum freeze is managed during off-hours; no aggressive vacuum is triggered = during business hours.
  2. Converted a few t= ables to unlogged.
  3. Removed unwanted tabl= es from publication.
  4. Partitioned all lar= ge tables.

Pending:

  1. Turning off/tuning autovacuum parame= ters to avoid triggering during business hours.

Not possible:=C2=A0We are running all tables i= n a single publication, and it is not possible to separate them.

=

I would greatly appreciate any suggestions you may have to help avoid logic= al replication delays, whether through tuning database or operating system = parameters, or any other recommendations


--
Thanks &= Regards,
Ram.
<= /div>

--0000000000007dd9300622c1dff2--