Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sEybD-005VJd-B3 for pgsql-general@arkaria.postgresql.org; Wed, 05 Jun 2024 21:55:16 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sEybC-00CMfs-SX for pgsql-general@arkaria.postgresql.org; Wed, 05 Jun 2024 21:55:14 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sEybC-00CMfk-7A for pgsql-general@lists.postgresql.org; Wed, 05 Jun 2024 21:55:14 +0000 Received: from mail-wm1-x336.google.com ([2a00:1450:4864:20::336]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sEyb5-003dkt-0U for pgsql-general@lists.postgresql.org; Wed, 05 Jun 2024 21:55:13 +0000 Received: by mail-wm1-x336.google.com with SMTP id 5b1f17b1804b1-42138eadf64so3861675e9.3 for ; Wed, 05 Jun 2024 14:55:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1717624504; x=1718229304; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=pypDflKmyRTZVycwrRsBLOUHHk36U2JlyvXZPXAzNVI=; b=Iu3NZ/2q9ecODDCrAY+sGzPCyHtmR8mpvj2qlKxU9d+Ph+C023XQpM3MZ4O85o5SRf 7DF4rgsHekr0PNvrPYSCiORqqIYArtlJpNUB5xLU8d9S3pp8NrIYNw0ZfIeQwf04t/AF 4Odf9pN56358PmolaSRTYHzltVOnKv3ErlNgkOqMS3n4Iy31bcUAqvKkocFies8f0NXR TxPS/yIGJDm/pUKE2aayTYAabKzqbfn3QW6Xt+PASVylG3zeNqakz5q2P/4fzOEti9tp UgZmOIV0QNd0qxctcGvQ/SzxjU8i+PKp9/sLrR2a9HxKPhYGSmh98X+xpyCe6njw5bBo Mf1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717624504; x=1718229304; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=pypDflKmyRTZVycwrRsBLOUHHk36U2JlyvXZPXAzNVI=; b=b6eCwi0drFk6sNE+QOyMNPmMcJ6S/jCQNDNKZQM/4Yj3mUCTlyjmZFuIcPXeup6oDx f0YcbBma/aVqcSeiuuhrVOukl9lxbhupUiW6XGmzSZ09cbqFHhq326hOkzAf0cSZPE1l 817mhcDzrE6Y9VzOZ0mIyQ8Q7aCwLpyogza+q/6zoZVyTDL6P5iZac+ZO1FXjwd/GQPf lgfngiEIxbsSZVpurqH6U5igrIbtmLJP2t7KCflFEhcNLU6+maCk1bBIMSFPU323B/ei Y912ZMTQIwILryDhwzOGZdioV1jUykWhhOPUgetsOXKySKbbAs7/EKEUB+yC0icDQE7g pLQQ== X-Gm-Message-State: AOJu0YzFRUo+UPV80ybZHrxph1sv1/7sH2eNOsN80tQsKRNioh3gLu/W Y4ZmLEz97s7eXqVIlsJxsoBOA+Pq57/7qKyLMQvxkcq9MK72enEUPJFccCGlLTeaVN9b+zeeJys JYxnWrG+nSh+ywPeyntyPWv+WhluiFO56 X-Google-Smtp-Source: AGHT+IFBv0Dp0aw50XgAd7gVUiwfLqmb8DyU0/wEjZrCAYZ3d9AmJJZa9WS1MKkCquowPTaP5iupTd3m9pnuvZtIg+o= X-Received: by 2002:a7b:c84d:0:b0:41a:9fc2:a6b1 with SMTP id 5b1f17b1804b1-421562ec9cdmr35183105e9.22.1717624504313; Wed, 05 Jun 2024 14:55:04 -0700 (PDT) MIME-Version: 1.0 References: <1628637f-419f-4f6a-9cb6-07af90cd0bc4@aklaver.com> In-Reply-To: <1628637f-419f-4f6a-9cb6-07af90cd0bc4@aklaver.com> From: Koen De Groote Date: Wed, 5 Jun 2024 23:54:52 +0200 Message-ID: Subject: Re: Questions on logical replication To: Adrian Klaver Cc: PostgreSQL General Content-Type: multipart/alternative; boundary="0000000000006ff597061a2b9e29" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000006ff597061a2b9e29 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable > > https://www.postgresql.org/docs/current/wal-configuration.html > > "Checkpoints are points in the sequence of transactions at which it is > guaranteed that the heap and index data files have been updated with all > information written before that checkpoint. At checkpoint time, all > dirty data pages are flushed to disk and a special checkpoint record is > written to the WAL file. (The change records were previously flushed to > the WAL files.) In the event of a crash, the crash recovery procedure > looks at the latest checkpoint record to determine the point in the WAL > (known as the redo record) from which it should start the REDO > operation. Any changes made to data files before that point are > guaranteed to be already on disk. Hence, after a checkpoint, WAL > segments preceding the one containing the redo record are no longer > needed and can be recycled or removed. (When WAL archiving is being > done, the WAL segments must be archived before being recycled or removed.= )" > And this is the same for logical replication and physical replication, I take it. Thus, if a leader has a standby of the same version, and meanwhile logical replication is being done to a newer version, both those replications are taken into account, is that correct? When you set up logical replication you are 'asking' via the replication slot that WAL records be kept on the publisher until the subscriber > retrieves them. > And if it cannot sync them, due to connectivity loss for instance, the WAL records will not be removed, then? Regards, Koen De Groote On Wed, Jun 5, 2024 at 1:05=E2=80=AFAM Adrian Klaver wrote: > On 6/4/24 15:55, Koen De Groote wrote: > > I recently read the entire documentation on logical replication, but am > > left with a question on the buildup of WAL > > > > On this page: > > > https://www.postgresql.org/docs/current/logical-replication-subscription.= html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT > < > https://www.postgresql.org/docs/current/logical-replication-subscription.= html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT > > > > > > It is written: " When dropping a subscription, the remote host is not > > reachable. In that case, disassociate the slot from the subscription > > using |ALTER SUBSCRIPTION| before attempting to drop the subscription. > > If the remote database instance no longer exists, no further action is > > then necessary. If, however, the remote database instance is just > > unreachable, the replication slot (and any still remaining table > > synchronization slots) should then be dropped manually; otherwise > > it/they would continue to reserve WAL and might eventually cause the > > disk to fill up. Such cases should be carefully investigated." > > > > > > Assuming a situation where I add tables 1 at a time to the publisher, > > and refresh the subscription every time. > > > > What happens if I shut down the subscriber database for a while? The > > subscription isn't dropped, so am I reading it right that the disk on > > the publisher will slowly be filling up with WAL? Isn't that always the > > case if wall is enabled? > > https://www.postgresql.org/docs/current/wal-configuration.html > > "Checkpoints are points in the sequence of transactions at which it is > guaranteed that the heap and index data files have been updated with all > information written before that checkpoint. At checkpoint time, all > dirty data pages are flushed to disk and a special checkpoint record is > written to the WAL file. (The change records were previously flushed to > the WAL files.) In the event of a crash, the crash recovery procedure > looks at the latest checkpoint record to determine the point in the WAL > (known as the redo record) from which it should start the REDO > operation. Any changes made to data files before that point are > guaranteed to be already on disk. Hence, after a checkpoint, WAL > segments preceding the one containing the redo record are no longer > needed and can be recycled or removed. (When WAL archiving is being > done, the WAL segments must be archived before being recycled or removed.= )" > > > > > This "cause disk to fill up" warning is quite concerning, and I'd like > > to understand what could cause it and how likely it is? I thought > > logical replication uses WAL by default, so doesn't that mean there has > > to be a log of changes kept anyhow? Even if the WAL isn't written to > > disk by an "archive_command"? > > > https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLI= CATION-SLOTS > > "Replication slots provide an automated way to ensure that the primary > does not remove WAL segments until they have been received by all > standbys, and that the primary does not remove rows which could cause a > recovery conflict even when the standby is disconnected." > > When you set up logical replication you are 'asking' via the replication > slot that WAL records be kept on the publisher until the subscriber > retrieves them. > > > > > Regards, > > Koen De Groote > > -- > Adrian Klaver > adrian.klaver@aklaver.com > > --0000000000006ff597061a2b9e29 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
https://www.postgresql.org/docs/curren= t/wal-configuration.html

"Checkpoints are points in the sequence of transactions at which it is=
guaranteed that the heap and index data files have been updated with all information written before that checkpoint. At checkpoint time, all
dirty data pages are flushed to disk and a special checkpoint record is written to the WAL file. (The change records were previously flushed to the WAL files.) In the event of a crash, the crash recovery procedure
looks at the latest checkpoint record to determine the point in the WAL (known as the redo record) from which it should start the REDO
operation. Any changes made to data files before that point are
guaranteed to be already on disk. Hence, after a checkpoint, WAL
segments preceding the one containing the redo record are no longer
needed and can be recycled or removed. (When WAL archiving is being
done, the WAL segments must be archived before being recycled or removed.)&= quot;

And this is the same for logical replicati= on and physical replication, I take it.

Thus, if a= leader has a standby of the same version, and meanwhile logical replicatio= n is being done to a newer version, both those replications are taken into = account, is that correct?


=C2=A0When you set up logical repli= cation you are 'asking' via the replication
slot that WAL records be= kept on the publisher until the subscriber
retrieves them.
<= br>
And if it cannot sync them, due to connectivity loss for inst= ance, the WAL records will not be removed, then?

R= egards,
Koen De Groote


= On Wed, Jun 5, 2024 at 1:05=E2=80=AFAM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 6/4/24 15:55, Koen D= e Groote wrote:
> I recently read the entire documentation on logical replication, but a= m
> left with a question on the buildup of WAL
>
> On this page:
> https://www.postgresql.org/docs/current/logical-replica= tion-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT <https://www.postgresql.org/docs/current/logical-replication-subscr= iption.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT>
>
> It is written: " When dropping a subscription, the remote host is= not
> reachable. In that case, disassociate the slot from the subscription <= br> > using |ALTER SUBSCRIPTION| before attempting to drop the subscription.=
> If the remote database instance no longer exists, no further action is=
> then necessary. If, however, the remote database instance is just
> unreachable, the replication slot (and any still remaining table
> synchronization slots) should then be dropped manually; otherwise
> it/they would continue to reserve WAL and might eventually cause the <= br> > disk to fill up. Such cases should be carefully investigated." >
>
> Assuming a situation where I add tables 1 at a time to the publisher, =
> and refresh the subscription every time.
>
> What happens if I shut down the subscriber database for a while? The <= br> > subscription isn't dropped, so am I reading it right that the disk= on
> the publisher will slowly be filling up with WAL? Isn't that alway= s the
> case if wall is enabled?

https://www.postgresql.org/docs/curren= t/wal-configuration.html

"Checkpoints are points in the sequence of transactions at which it is=
guaranteed that the heap and index data files have been updated with all information written before that checkpoint. At checkpoint time, all
dirty data pages are flushed to disk and a special checkpoint record is written to the WAL file. (The change records were previously flushed to the WAL files.) In the event of a crash, the crash recovery procedure
looks at the latest checkpoint record to determine the point in the WAL (known as the redo record) from which it should start the REDO
operation. Any changes made to data files before that point are
guaranteed to be already on disk. Hence, after a checkpoint, WAL
segments preceding the one containing the redo record are no longer
needed and can be recycled or removed. (When WAL archiving is being
done, the WAL segments must be archived before being recycled or removed.)&= quot;

>
> This "cause disk to fill up" warning is quite concerning, an= d I'd like
> to understand what could cause it and how likely it is? I thought
> logical replication uses WAL by default, so doesn't that mean ther= e has
> to be a log of changes kept anyhow? Even if the WAL isn't written = to
> disk by an "archive_command"?

https://www.pos= tgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS<= br>
"Replication slots provide an automated way to ensure that the primary=
does not remove WAL segments until they have been received by all
standbys, and that the primary does not remove rows which could cause a recovery conflict even when the standby is disconnected."

When you set up logical replication you are 'asking' via the replic= ation
slot that WAL records be kept on the publisher until the subscriber
retrieves them.

>
> Regards,
> Koen De Groote

--
Adrian Klaver
adrian.klave= r@aklaver.com

--0000000000006ff597061a2b9e29--