Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ss6JK-0066Wq-Ez for pgsql-general@arkaria.postgresql.org; Sat, 21 Sep 2024 20:02:31 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ss6JG-009UIZ-Vy for pgsql-general@arkaria.postgresql.org; Sat, 21 Sep 2024 20:02:28 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ss6JG-009UIR-Ak for pgsql-general@lists.postgresql.org; Sat, 21 Sep 2024 20:02:27 +0000 Received: from mail-io1-xd2c.google.com ([2607:f8b0:4864:20::d2c]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1ss6JE-000NEk-MN for pgsql-general@lists.postgresql.org; Sat, 21 Sep 2024 20:02:26 +0000 Received: by mail-io1-xd2c.google.com with SMTP id ca18e2360f4ac-82aa93774ceso162902139f.0 for ; Sat, 21 Sep 2024 13:02:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726948944; x=1727553744; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=QQMc/QZJWjHSJ1d29sPFPIfhyeAQIOu6Cj6OWwL77/c=; b=ADAU106Pnz/Jh2TCaMNatjtouffhm3qQa2hAGP5sEXHw7Oped6H6V5lD5jIal4qD3H NG8YEcFO+jiF9F6QXh8kdvbOF8ONEXK6WP16EB6cq/Lb2+IbUPYBrcrvoa/1FTDLDAxj eQsVv0J5EhLQmXslNyzkiRvuxcFurNZKEqGsRq9zW98GHW32/1+vH7lAA40SD8KXpBeS SoW0QSQny9XWkVC4X/7bpIh2h+wAL8EAQtx98pPJT4jwB092ZLiRGQI3Ve/fxAsmqK0k zm1IZyf6WI6ffqy/MnSpOjpyVxlBzKOyd+nQvJ+GQTgl8tAdFt1Izvch48Ts4N24EFcw 3GZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726948944; x=1727553744; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=QQMc/QZJWjHSJ1d29sPFPIfhyeAQIOu6Cj6OWwL77/c=; b=QInWZC/Mhu6rbsyfRZBgycyMElPTX/ooquAy4gknKbMd5gPbJQpB/WY92nB3fb+QnK VahAqVWVwLj3ctXmSfJ1O1wvh9onJoTkQwywXVn1B/ZuXzRaY0laVcFOUf3xaljPUFWS +hNottShbIRnkXl6lmBMVP23JmabwEye+dtcbo7ZZ2nxuyIoPQrJFqtF2GbY250YT184 ATsvBrq50tSu40RdIcQ9EaTMBCCvG+rkMJug1PHkiurPFuglHPP0fvTGyBZ2YVPKhNud 2mU3mWd081rco8x+2+I73hOM/TJJpgS5p+bXxwLW9Yi/pYQHfvclJaLEyIkXhUQ/r/YS /McA== X-Gm-Message-State: AOJu0YwWY5HNrdeKvJk+viuR1KeTiRNu+PVIj0K3ftEbdT4yK/GpCWIl mAYEwsCcXeFuLfGhWgShpg681nI4K5dARcwStXMmEmDpzHyPueD3gxk42nBLUkOdBVTz3MmQ+4z hp1AnYmcfNGHFGPMH5r8JUeSek7c= X-Google-Smtp-Source: AGHT+IECGh+DqVMyU8wr1BMzCeP12/Dp+qx6aHESfQIc6kAzlzCFZV77q7lDJYDmdveWuMlsbBmC34XLJfd49FiUoDg= X-Received: by 2002:a05:6e02:180f:b0:3a0:4fb8:ceda with SMTP id e9e14a558f8ab-3a0c9d70003mr62621665ab.17.1726948943662; Sat, 21 Sep 2024 13:02:23 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Justin Date: Sat, 21 Sep 2024 16:02:11 -0400 Message-ID: Subject: Re: Logical Replication Delay To: Ramakrishna m Cc: pgsql-general@lists.postgresql.org, ravisql09@gmail.com Content-Type: multipart/alternative; boundary="0000000000005517350622a6a216" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000005517350622a6a216 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Ramakrishna, 4GB of WAL generated per minute is a lot. I would expect the replay on the subscriber to lag behind because it is a single process. PostgreSQL 16 can create parallel workers for large transactions, however if there is a flood of small transactions touching many tables the single LR worker is going to fall behind. The only option is split the LR into multiple Publications and Subscriptions as a single worker can't keep up. What is the justification to not split the tables across multiple Publications and Subscriptions Additional items to check Make sure the Subscriber is using binary mode, this avoids an encoding step. https://www.postgresql.org/docs/current/sql-createsubscription.html Avoid the use of IDENTITY SET TO FULL on the publisher, if you do use IDENTITY FULL make sure the subscriber table identity is set to a qualifying unique index. In previous versions of PG the publisher and subscriber identities had to match... IDENTITY SET TO FULL increase the size of the WAL and the work the publisher and subscriber has to do. Hope this helps. On Sat, Sep 21, 2024 at 3:08=E2=80=AFPM Ramakrishna m = wrote: > Hi Team, > > We have configured bidirectional replication (but traffic can only flow i= n > one direction) between two data centers (distance: 1000 km, maximum Netwo= rk > latency: 100 ms) with an application TPS (transactions per second) of 700 > at maximum. > > We are fine with handling up to 500 TPS without observing any lag between > the two data centers. However, when TPS increases, we notice a lag in WAL > files of over 100 GB (initially, it was 1 TB, but after tuning, it was > reduced to 100 GB). During peak times, WAL files are generated at a rate = of > 4 GB per minute. > > All transactions (Tx) take less than 200 ms, with a maximum of 1 second a= t > times (no long-running transactions). > > *Here are the configured parameters and resources:* > > - *OS*: Ubuntu > - *RAM*: 376 GB > - *CPU*: 64 cores > - *Swap*: 32 GB > - *PostgreSQL Version*: 16.4 (each side has 3 nodes with Patroni and > etcd configured) > - *DB Size*: 15 TB > > *Parameters configured on both sides:* > Name > Setting Unit > > > log_replication_commands off > logical_decoding_work_mem 524288 kB > max_logical_replication_workers 16 > max_parallel_apply_workers_per_subscription 2 > max_replication_slots 20 > max_sync_workers_per_subscription 2 > max_wal_senders 20 > max_worker_processes 40 > wal_level logical > wal_receiver_timeout 600000 ms > wal_segment_size 1073741824 B > wal_sender_timeout 600000 ms > > *Optimizations applied:* > > 1. Vacuum freeze is managed during off-hours; no aggressive vacuum is > triggered during business hours. > 2. Converted a few tables to unlogged. > 3. Removed unwanted tables from publication. > 4. Partitioned all large tables. > > *Pending:* > > 1. Turning off/tuning autovacuum parameters to avoid triggering during > business hours. > > *Not possible: *We are running all tables in a single publication, and it > is not possible to separate them. > > I would greatly appreciate any suggestions you may have to help avoid > logical replication delays, whether through tuning database or operating > system parameters, or any other recommendations > > -- > Thanks & Regards, > Ram. > --0000000000005517350622a6a216 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi=C2=A0Ramakrishna,

4GB of WAL generated per minut= e is a lot.=C2=A0 I would expect the replay=C2=A0on the subscriber to lag b= ehind because it is a single process.=C2=A0 PostgreSQL 16 can create parall= el workers for large transactions, however if there is a flood of small tra= nsactions touching many tables the single LR worker is going to fall behind= .

The only option is split the=C2=A0LR into multiple=C2=A0 Publicati= ons and Subscriptions as a single worker can't keep up.=C2=A0

Wh= at is the justification to not split the tables across multiple Publication= s and Subscriptions=C2=A0

Additional=C2=A0items to check

Make= sure the Subscriber is using binary mode, this avoids an encoding step.=C2= =A0
https://www.postgresql.org/docs/current/sql-createsubscription.= html

Avoid the use of IDENTITY SET TO FULL on the publisher, if = you do use IDENTITY FULL make sure the subscriber=C2=A0table identity is se= t to a qualifying unique index.=C2=A0 In previous versions of PG the publis= her and subscriber identities had to match...

IDENTITY SET TO FULL i= ncrease the size of the WAL=C2=A0and=C2=A0the work the publisher and subscr= iber has to do.=C2=A0

Hope this=C2=A0helps.



On Sat, Sep 2= 1, 2024 at 3:08=E2=80=AFPM Ramakrishna m <ram.pgdb@gmail.com> wrote:

Hi Team,<= /font>

We have configured bidirectional replicati= on (but traffic can only flow in one direction) between two data centers (d= istance: 1000 km, maximum Network latency: 100 ms) with an application TPS = (transactions per second) of 700 at maximum.

We are fine with handling up to 500 TPS without observing any lag betwee= n the two data centers. However, when TPS increases, we notice a lag in WAL= files of over 100 GB (initially, it was 1 TB, but after tuning, it was red= uced to 100 GB). During peak times, WAL files are generated at a rate of 4 = GB per minute.

All transactions (Tx) take = less than 200 ms, with a maximum of 1 second at times (no long-running tran= sactions).

Here are the configured= parameters and resources:

  • <= strong>OS: Ubuntu
  • RAM: 376 GB
  • CPU: 64= cores
  • Swap: 32 GB
  • PostgreSQL Version: 16.4 (e= ach side has 3 nodes with Patroni and etcd configured)
  • DB Size: 15 TB

<= font color=3D"#000000" style=3D"background-color:rgb(255,255,255)" face=3D"= arial, sans-serif">Parameters configured on both sides:

=
Name=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 Setting=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Unit
= max_sync_workers_per_subscriptionwal_receiver_timeout


log_replication_commandsoff
logical_decoding_work_mem524288kB
max_logical_replication_workers16
max_paral= lel_apply_workers_per_subscription=C2= =A02
max_replication_s= lots20
2
max= _wal_senders20
max_worker_processes40
wal_level= logical
600000ms
wal_segment_size10737= 41824B
wal_sender_timeout600000ms

Optimizations applied:

  1. <= font color=3D"#000000" style=3D"background-color:rgb(255,255,255)" face=3D"= arial, sans-serif">Vacuum freeze is managed during off-hours; no aggressive= vacuum is triggered during business hours.
  2. Converted a few tables to unlogged.
  3. = Removed unwanted tables from publication.
  4. Pending:

    1. Turning off/tun= ing autovacuum parameters to avoid triggering during business hours.=

    Not possible:=C2=A0We are= running all tables in a single publication, and it is not possible to sepa= rate them.

    I would greatly appreciate any suggestions you may have to help avoid logic= al replication delays, whether through tuning database or operating system = parameters, or any other recommendations


    --
    Thanks &= Regards,
    Ram.
    <= /div>
--0000000000005517350622a6a216--