Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1taICd-000V7c-Tm for pgsql-general@arkaria.postgresql.org; Tue, 21 Jan 2025 17:38:16 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1taICb-002Vt2-Kc for pgsql-general@arkaria.postgresql.org; Tue, 21 Jan 2025 17:38:13 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1taICb-002Vsl-5G for pgsql-general@lists.postgresql.org; Tue, 21 Jan 2025 17:38:13 +0000 Received: from mail-ej1-x630.google.com ([2a00:1450:4864:20::630]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1taICY-000lZa-2y for pgsql-general@lists.postgresql.org; Tue, 21 Jan 2025 17:38:12 +0000 Received: by mail-ej1-x630.google.com with SMTP id a640c23a62f3a-aab925654d9so1048589366b.2 for ; Tue, 21 Jan 2025 09:38:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1737481089; x=1738085889; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=UM9B+V1CNFhSti4RFUVhR5deuYHZEI9/2Nu82E7SfB4=; b=HJpNGj7M0S0mngcAPE+Wb5S9I7ZSuBmaqGwVJbFM1v+ePHYd5h8Flkom7DWT3PORiB FhSe5pV6NfvHMOxaJ0mKMEsyniNlO8nUP5ni3dmpQKSpJCaZE/zry3o/bbiVZImGzPS8 fA9NXEy0CWvhtU/hJUFCRp76c2P7ZsePwNdLViua0XRWIYH8kl7ed52FGxV9l6/ygDMZ BDIoAcJDVEx++vu/PhQkJRn14tljL0KuCKh6Ag1gJCpXJTGLEfJj3ljOK0S+VwHtvSCP PbCJbfXKxLQOr55f10a58YZ09nI2wKuT9O60A52wHpFvJuXQgll2T7e/I1vTUzehg/Ox V/9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737481089; x=1738085889; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=UM9B+V1CNFhSti4RFUVhR5deuYHZEI9/2Nu82E7SfB4=; b=LsO28buikdED1cPUBrhjcixkLy8MyOq837n5kdYkyYQsIjvud06ydhPjvYPktL/Xz0 cmZg1w7ow1pksjetY/r5dUg+MGj59o5q9qGMYyJAL2LVQ9YORfpLcZi1Q0PoXYvii/r0 NwGBhP5xiu2YTMtFENN9UQsmjDGi7y+AQpF42MfzkgaBG1QQwj8GJj2MLG6bSRkoSu/l kIg3Dov2aimmFA8RcP3cAaTmfjozFPXIjtA5rmaXTeYTffFlnHTzPKBCwV2sUyg4wDpm fqVAdWQmW7AVHsqUbWTbRgyfXPY+oosizrgbSxjmzUfOQRbIn95vhRlHFFj8RT5DKeGL BROw== X-Gm-Message-State: AOJu0YwXuPmMA+IPSu1Ebh0Ii14iC2hTQxAeRTKrQtw0bBh6TI6jOqwJ CpTaEJBrp5fNytLZEeL6MPSH49uyP4Y2gn5v6K4sBgO+ZhgQj9urspdhTT8GEg2odAxUI446PK5 8//4XvcuFdP2tJw3REKeDcNH43QQhBjm2460= X-Gm-Gg: ASbGnct1IJzQu57t+xZHyCCfHEwth6Bkdw4mNq8BItSSqPEB1TqOBdbQpwX2guXMun1 u/D0c812/N2wQhbak8wqmM1WC0Vd7s3w/COK4yPuWuFYVYIkdK+A= X-Google-Smtp-Source: AGHT+IE4u8NlhPDt9DfhMzdK55hXcMuVpHfJOGP4yQHQhfgpr8YH5W5/OmMNYg0XDWRAPmLZSjdNBZMxvILAvVm4BEE= X-Received: by 2002:a17:906:dc8c:b0:ab2:f937:b3aa with SMTP id a640c23a62f3a-ab38b4cff42mr1726608966b.56.1737481088586; Tue, 21 Jan 2025 09:38:08 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Durgamahesh Manne Date: Tue, 21 Jan 2025 23:08:42 +0530 X-Gm-Features: AbW1kvb77-VmN-PSDJ4F8mNdTHz-Zv6HzWRlzjTXMCw0xGnPhYxqOuDejOjgEYs Message-ID: Subject: Re: Records count mismatch with logical replication To: Adrian Klaver Cc: pgsql-general Content-Type: multipart/alternative; boundary="00000000000016de14062c3ad7f3" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000016de14062c3ad7f3 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jan 21, 2025 at 9:24=E2=80=AFPM Adrian Klaver wrote: > On 1/21/25 04:08, Durgamahesh Manne wrote: > > Hi Team, > > > > I have publication and subscription servers .So seems data replication > > running with minimal lag but records count mismatch with more than 10 > > thousand records between source and destination tables > > > > Could you please help in resolving this issue? > > Not without a good deal more information: > > 1) Postgres version on either side of the replication. > > 2) The replication configuration. > > 3) Define how lag is being calculated and what 'minimal' is. > > 4) Define how the record counts are being derived. > > 5) The network distance between the servers. > > > > > Regards, > > Durga Mahesh > > > > > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com > > Hi Adrian Klaver Really Thanks for your quick response This happened during repack lag went to more than 350Gb then gradually decreased to minimal lag after running pg_repack 1) Postgres version on either side of the replication. Source(publication) :16.4 Destination(subscription) : 14.11 2) The replication configuration. OLAP workload archiving (32GB 8 Vcpus) Source : wal_level =3D logical wal_sender_timeout =3D 30s max_wal_senders =3D 40 max_replication_slots =3D 20 max_logical_replication_workers =3D 4 wal_buffers =3D 64MB commit_delay =3D 2000 commi_siblings =3D 12 wal_writer_delay =3D 300 wal_writer_flush_after =3D 1MB bgwriter_delay =3D 20 min_wal_size =3D 8GB max_wal_size =3D 32Gb Destination : 128GB 32 vcpus wal_level =3D logical wal_receiver_timeout =3D 30s max_wal_senders =3D 40 max_replication_slots =3D 60 max_logical_replication_workers =3D 23 wal_buffers =3D 64MB commit_delay =3D default commi_siblings =3D default wal_writer_delay =3D default wal_writer_flush_after =3D default bgwriter_delay =3D 20 min_wal_size =3D 8GB max_wal_size =3D 32Gb 3) Define how lag is being calculated and what 'minimal' is. postgres=3D> select slot_name,pg_wal_lsn_diff(pg_current_wal_lsn(),restart_lsn) as bytes_behind= , pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as behind_size, active from pg_replication_slots; slot_name | bytes_behind | behind_size | active -------------+--------------+-------------+-------- cls_eva_msa | 22906216 | 22 MB | t 4) Define how the record counts are being derived. Source : archiving=3D> select count(*) from archiving.events_archive ; count --------- 1262908 (1 row) Destination : archiving=3D> select count(*) from archiving.events_archive ; count --------- 1252062 (1 row) 5) The network distance between the servers. Both are under same vpc security groups Regards Durga Mahesh --00000000000016de14062c3ad7f3 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Tue, Jan 21,= 2025 at 9:24=E2=80=AFPM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 1/21/25 04:08, Durgamahesh Manne w= rote:
> Hi=C2=A0 Team,
>
> I have publication and subscription servers .So seems data replication=
> running with minimal lag but records count mismatch=C2=A0with more tha= n 10
> thousand=C2=A0records between source and destination tables
>
> Could you please help in resolving this=C2=A0issue?

Not without a good deal more information:

1) Postgres version on either side of the replication.

2) The replication configuration.

3) Define how lag is being calculated and what 'minimal' is.

4) Define how the record counts are being derived.

5) The network distance between the servers.

>
> Regards,
> Durga Mahesh
>
>
>

--
Adrian Klaver
adrian.klave= r@aklaver.com

Hi Adrian Klaver=C2=A0

Reall= y Thanks for your quick response=C2=A0

This happen= ed during repack lag went to more than 350Gb then gradually decreased=C2=A0= to minimal=C2=A0lag after running pg_repack=C2=A0

= 1) Postgres version on either side of the replication.

=
=C2=A0Source(publication) :16.4
=C2=A0Destination(subscripti= on) : 14.11

2) The replication configuration.
OLAP workloa= d archiving (32GB 8 Vcpus)
=C2=A0Source : wal_level =3D logical
wal_sender_timeout =3D 30s
=C2=A0max_wal_senders =3D 40<= /div>
max_replication_slots =3D 20
max_logical_replication_wo= rkers =3D 4
wal_buffers =3D 64MB=C2=A0
commit_delay =3D= 2000
commi_siblings =3D 12
wal_writer_delay =3D 300
wal_writer_flush_after =3D 1MB
bgwriter_delay =3D 20=C2= =A0
min_wal_size =3D 8GB
max_wal_size =3D 32Gb

Destination :=C2=A0 128GB 32 vcpus=C2=A0
wal_lev= el =3D logical
wal_receiver_timeout =3D 30s
=C2=A0max_w= al_senders =3D 40
max_replication_slots =3D 60
max_logi= cal_replication_workers =3D 23
wal_buffers =3D 64MB=C2=A0
commit_delay =3D default
commi_siblings =3D default
= wal_writer_delay =3D default
wal_writer_flush_after =3D default
bgwriter_delay =3D 20=C2=A0
min_wal_size =3D 8GB
max_wal_size =3D 32Gb




3)= Define how lag is being calculated and what 'minimal' is.

postgres=3D> select slot_name,pg_wal_lsn_diff(pg_curren= t_wal_lsn(),restart_lsn) as bytes_behind,
pg_size_pretty(pg_wal_lsn_diff= (pg_current_wal_lsn(), restart_lsn)) as behind_size, active from pg_replica= tion_slots;
=C2=A0 slot_name =C2=A0| bytes_behind | behind_size | active=
-------------+--------------+-------------+--------
=C2=A0cls_eva_ms= a | =C2=A0 =C2=A0 22906216 | 22 MB =C2=A0 =C2=A0 =C2=A0 | t

4= ) Define how the record counts are being derived.
=C2=A0 Source := archiving=3D> select count(*) from archiving.events_archive ;
=C2=A0= count
---------
=C2=A01262908
(1 row)

Destination : archiv= ing=3D> select count(*) from archiving.events_archive ;
=C2=A0 count<= br>---------
=C2=A01252062
(1 row)

5) The network distance bet= ween the servers.

=C2=A0Both are under same vpc se= curity groups

Regards
Durga Mahesh
=
=C2=A0
--00000000000016de14062c3ad7f3--