Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sFR5s-002GSJ-1B for pgsql-general@arkaria.postgresql.org; Fri, 07 Jun 2024 04:20:48 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sFR5q-007ep3-9s for pgsql-general@arkaria.postgresql.org; Fri, 07 Jun 2024 04:20:47 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sFR5p-007eov-TO for pgsql-general@lists.postgresql.org; Fri, 07 Jun 2024 04:20:46 +0000 Received: from mail-vs1-xe34.google.com ([2607:f8b0:4864:20::e34]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sFR5n-000OlX-LM for pgsql-general@lists.postgresql.org; Fri, 07 Jun 2024 04:20:46 +0000 Received: by mail-vs1-xe34.google.com with SMTP id ada2fe7eead31-48bd4cefe5cso595697137.0 for ; Thu, 06 Jun 2024 21:20:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1717734042; x=1718338842; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=zLTSevgdfOlYh+0hfOJSZLOkyqUcqZuWlyidHdJKVVk=; b=hBREGfgtQtlR5v42yr5ZderWzewOpTiKHmT8naGO3ZFlXFQqvW2KR39UU8nIgZVnyA 3vfTVDrFUoLkYPFRgRNg6egkCJbX1xCrDDrVnNPACIYWGvqlUiSg1D3pFN+hWpe3GJDn n2/Z0tD/dfqGXxQkCKpqnrKbROoiimFmtj8kYGCc4104+OkQxvGO6joic0OfBaGffXLg OE21LeuqUDifHXB2jvel5yUYhyQ7AbHJggYpOd/amUzoJ9vm47uAw+isA2pII1zSRrVk jdyxWq1eZ0sMo2FfY/bihdZsDSqIg0vy+eA+HtqMcUHyzAInhvBmHZONVr5TvViD9shY L4Cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717734042; x=1718338842; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=zLTSevgdfOlYh+0hfOJSZLOkyqUcqZuWlyidHdJKVVk=; b=Yn2iZIteAIgiqHeSpjdqqK/5K2+30V8CfAwmOD6yxzkLmatv5cKzNaMDy0Xt77t5K4 AnhIZuTUBQxHv2BDTyMLyTf3Y7NEZjzXu+dgsISaieIHlQrEH1ARQ51hw/4zqC8kikkZ yzXZtm9QXieec5hOiOwya3rHQQzseOrsSlvbIx7mczE+b1PxiYpCrpAhhTa7j47W4x6F J1o7BIbnD7bb2i1pZTKD9jZGqofcekUDk+aH9MmTtZ0C88+LeaQT17ROl9tftb3LiGwK WG7rjE7BEcxyBO8JtjoPlXvSdakpoXH0zMD8mECUAVXiXsw5Bda14GR9QE4rbSTb5eWF QInA== X-Forwarded-Encrypted: i=1; AJvYcCXCJAWAIH1+UCRrjmvPnYhgp+QDk9seVp0inAJjkA1JMNnL6IQl2gjUx/+F2n4rXbr0FntcILjvMZezWoQqZkoi2nvwUH3jqYO8tDO7cgl+/qQI X-Gm-Message-State: AOJu0YyD/bxpCrjfXsfcNAG//XaUbDq4odVWz8varaYeuTRHo1S/wtrQ M2p8P+288vhTf/QL4b4KO4Q4mk0AQZpgQCWLlkGEFtO9hn7MoiMMM4vr1HiXKro/dG162buB5ng PE89h2biQdrAuW/7JiSlgNH6TZBg= X-Google-Smtp-Source: AGHT+IH5++gGCyJ/e6Mh6dPn+/w83doJQaNBv2mIgLeNEjyIEMpweJPJ3dFJKNQup+Wsu8Hp5D4rx6f1jgwTFSvZTI4= X-Received: by 2002:a05:6102:292c:b0:47c:122b:a3bd with SMTP id ada2fe7eead31-48c272a86e5mr1841903137.2.1717734041913; Thu, 06 Jun 2024 21:20:41 -0700 (PDT) MIME-Version: 1.0 References: <1628637f-419f-4f6a-9cb6-07af90cd0bc4@aklaver.com> <1c0273f5-a90a-48f6-b51f-fe15c16fa1c6@aklaver.com> In-Reply-To: From: Kashif Zeeshan Date: Fri, 7 Jun 2024 09:20:30 +0500 Message-ID: Subject: Re: Questions on logical replication To: Koen De Groote Cc: Adrian Klaver , PostgreSQL General Content-Type: multipart/alternative; boundary="0000000000006311a2061a451f17" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000006311a2061a451f17 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, Jun 7, 2024 at 3:19=E2=80=AFAM Koen De Groote w= rote: > I'll give them a read, though it might take a few weekends > > Meanwhile, this seems to be what I'm looking for: > > From > https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLI= CATION-SLOTS > > " Replication slots provide an automated way to ensure that the primary > does not remove WAL segments until they have been received by all standby= s, > and that the primary does not remove rows which could cause a recovery > conflict > > even when the standby is disconnected." > > I'm reading that as: "if there is a replication slot, if the standby is > disconnected, WAL is kept" > > And if we know WAL is kept in the "pg_wal" directory, that sounds like it > could slowly but surely fill up disk space. > Hi Yes that is a consideration with logical replication but the possible cast out weight the benefit. The kept WAL file size will only increase if the standby is offline. Regards Kashif Zeeshan Bitnine Global > > > But again, I'll give them a read. I've read all of logical replication > already, and I feel like I didn't get my answer there. > > Thanks for the help > > > Regards, > Koen De Groote > > On Thu, Jun 6, 2024 at 12:19=E2=80=AFAM Adrian Klaver > wrote: > >> On 6/5/24 14:54, Koen De Groote wrote: >> > https://www.postgresql.org/docs/current/wal-configuration.html >> > >> > >> > "Checkpoints are points in the sequence of transactions at which i= t >> is >> > guaranteed that the heap and index data files have been updated wi= th >> > all >> > information written before that checkpoint. At checkpoint time, al= l >> > dirty data pages are flushed to disk and a special checkpoint >> record is >> > written to the WAL file. (The change records were previously >> flushed to >> > the WAL files.) In the event of a crash, the crash recovery >> procedure >> > looks at the latest checkpoint record to determine the point in th= e >> WAL >> > (known as the redo record) from which it should start the REDO >> > operation. Any changes made to data files before that point are >> > guaranteed to be already on disk. Hence, after a checkpoint, WAL >> > segments preceding the one containing the redo record are no longe= r >> > needed and can be recycled or removed. (When WAL archiving is bein= g >> > done, the WAL segments must be archived before being recycled or >> > removed.)" >> > >> > >> > And this is the same for logical replication and physical replication, >> I >> > take it. >> >> High level explanation, both physical and logical replication use the >> WAL files as the starting point. When the recycling is done is dependent >> on various factors. My suggestion would be to read through the below to >> get a better idea of what is going. There is a lot to cover, but if you >> really want to understand it you will need to go through it. >> >> Physical replication >> >> https://www.postgresql.org/docs/current/high-availability.html >> >> 27.2.5. Streaming Replication >> 27.2.6. Replication Slots >> >> Logical replication >> >> https://www.postgresql.org/docs/current/logical-replication.html >> >> WAL >> >> https://www.postgresql.org/docs/current/wal.html >> >> >> >> > >> > Thus, if a leader has a standby of the same version, and meanwhile >> > logical replication is being done to a newer version, both those >> > replications are taken into account, is that correct? >> >> Yes, see links above. >> >> >> > And if it cannot sync them, due to connectivity loss for instance, the >> > WAL records will not be removed, then? >> >> Depends on the type of replication being done. It is possible for >> physical replication to have WAL records removed that are still needed >> downstream. >> >> From >> >> >> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPL= ICATION >> >> "If you use streaming replication without file-based continuous >> archiving, the server might recycle old WAL segments before the standby >> has received them. If this occurs, the standby will need to be >> reinitialized from a new base backup. You can avoid this by setting >> wal_keep_size to a value large enough to ensure that WAL segments are >> not recycled too early, or by configuring a replication slot for the >> standby. If you set up a WAL archive that's accessible from the standby, >> these solutions are not required, since the standby can always use the >> archive to catch up provided it retains enough segments." >> >> This is why it is good idea to go through the links I posted above. >> >> > >> > Regards, >> > Koen De Groote >> > >> >> >> -- >> Adrian Klaver >> adrian.klaver@aklaver.com >> >> --0000000000006311a2061a451f17 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Fri, Jun 7, 2024 at 3:19=E2=80=AFA= M Koen De Groote <kdg.dev@gmail.com= > wrote:
=
I'll give them a read, though it might take a few= weekends

Meanwhile, this seems to be what I'm= looking for:


" Replication slots provide an automated way to ensure that the primary=20 does not remove WAL segments until they have been received by all=20 standbys, and that the primary does not remove rows which could cause a recovery conflict even when the standby is disconnected."
=

I'm reading that as: "if there is a replicatio= n slot, if the standby is disconnected, WAL is kept"

And if we know WAL is kept in the "pg_wal" directory, tha= t sounds like it could slowly but surely fill up disk space.

Hi

Yes that is a con= sideration with logical replication but the possible cast out weight the be= nefit.
The kept WAL file size will only increase if the standby i= s offline.

Regards
Kashif Zeeshan
<= div>Bitnine Global=C2=A0


But again, I'l= l give them a read. I've read all of logical replication already, and I= feel like I didn't get my answer there.

Thank= s for the help


Regards,
Koen De Groote

On Thu, Jun 6, 2024 at 12:19=E2=80=AFAM Adrian K= laver <ad= rian.klaver@aklaver.com> wrote:
On 6/5/24 14:54, Koen De Groote wrote:
>=C2=A0 =C2=A0 =C2=A0https://www.po= stgresql.org/docs/current/wal-configuration.html
>=C2=A0 =C2=A0 =C2=A0<https://ww= w.postgresql.org/docs/current/wal-configuration.html>
>
>=C2=A0 =C2=A0 =C2=A0"Checkpoints are points in the sequence of tra= nsactions at which it is
>=C2=A0 =C2=A0 =C2=A0guaranteed that the heap and index data files have = been updated with
>=C2=A0 =C2=A0 =C2=A0all
>=C2=A0 =C2=A0 =C2=A0information written before that checkpoint. At chec= kpoint time, all
>=C2=A0 =C2=A0 =C2=A0dirty data pages are flushed to disk and a special = checkpoint record is
>=C2=A0 =C2=A0 =C2=A0written to the WAL file. (The change records were p= reviously flushed to
>=C2=A0 =C2=A0 =C2=A0the WAL files.) In the event of a crash, the crash = recovery procedure
>=C2=A0 =C2=A0 =C2=A0looks at the latest checkpoint record to determine = the point in the WAL
>=C2=A0 =C2=A0 =C2=A0(known as the redo record) from which it should sta= rt the REDO
>=C2=A0 =C2=A0 =C2=A0operation. Any changes made to data files before th= at point are
>=C2=A0 =C2=A0 =C2=A0guaranteed to be already on disk. Hence, after a ch= eckpoint, WAL
>=C2=A0 =C2=A0 =C2=A0segments preceding the one containing the redo reco= rd are no longer
>=C2=A0 =C2=A0 =C2=A0needed and can be recycled or removed. (When WAL ar= chiving is being
>=C2=A0 =C2=A0 =C2=A0done, the WAL segments must be archived before bein= g recycled or
>=C2=A0 =C2=A0 =C2=A0removed.)"
>
>
> And this is the same for logical replication and physical replication,= I
> take it.

High level explanation, both physical and logical replication use the
WAL files as the starting point. When the recycling is done is dependent on various factors. My suggestion would be to read through the below to get a better idea of what is going. There is a lot to cover, but if you really want to understand it you will need to go through it.

Physical replication

https://www.postgresql.org/docs/curren= t/high-availability.html

27.2.5. Streaming Replication
27.2.6. Replication Slots

Logical replication

https://www.postgresql.org/docs/curr= ent/logical-replication.html

WAL

https://www.postgresql.org/docs/current/wal.html=



>
> Thus, if a leader has a standby of the same version, and meanwhile > logical replication is being done to a newer version, both those
> replications are taken into account, is that correct?

Yes, see links above.


> And if it cannot sync them, due to connectivity loss for instance, the=
> WAL records will not be removed, then?

Depends on the type of replication being done. It is possible for
physical replication to have WAL records removed that are still needed
downstream.

From

https://www.postgresq= l.org/docs/current/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous
archiving, the server might recycle old WAL segments before the standby has received them. If this occurs, the standby will need to be
reinitialized from a new base backup. You can avoid this by setting
wal_keep_size to a value large enough to ensure that WAL segments are
not recycled too early, or by configuring a replication slot for the
standby. If you set up a WAL archive that's accessible from the standby= ,
these solutions are not required, since the standby can always use the
archive to catch up provided it retains enough segments."

This is why it is good idea to go through the links I posted above.

>
> Regards,
> Koen De Groote
>


--
Adrian Klaver
adrian.klave= r@aklaver.com

--0000000000006311a2061a451f17--