public inbox for [email protected]  
Inaccurate statement about log shipping replication mode
14+ messages / 5 participants

* Inaccurate statement about log shipping replication mode
@ 2025-08-21 15:20 PG Doc comments form <[email protected]>
  2025-08-25 07:58 ` Re: Inaccurate statement about log shipping replication mode Laurenz Albe <[email protected]>
  0 siblings, 1 reply; 14+ messages in thread

From: PG Doc comments form @ 2025-08-21 15:20 UTC (permalink / raw)
  To: [email protected]; +Cc: [email protected]

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/warm-standby.html
Description:

Hello,

The documentation page about Log-Shipping Standby Servers, after describing
file-based log shipping and record-based log shipping (streaming
replication), states: "It should be noted that log shipping is
asynchronous, i.e., the WAL records are shipped after transaction commit."
This statement is misleading because the same page includes a section about
configuring synchronous streaming replication. To avoid confusion, I think
it makes sense to specify that record-based log shipping can be configured
as either asynchronous or synchronous.

Link: https://www.postgresql.org/docs/current/warm-standby.html



* Re: Inaccurate statement about log shipping replication mode

From: Laurenz Albe @ 2025-08-25 07:58 UTC (permalink / raw)
  To: [email protected]; [email protected]

On Thu, 2025-08-21 at 15:20 +0000, PG Doc comments form wrote:
> Page: https://www.postgresql.org/docs/17/warm-standby.html
> 
> The documentation page about Log-Shipping Standby Servers after describing
> that there are file-based log shipping and record-based log shipping
> (streaming replication) states: "It should be noted that log shipping is
> asynchronous, i.e., the WAL records are shipped after transaction commit.".
> This statement is misleading because the same page includes a section about
> configuring synchronous streaming replication. To avoid confusion, I think
> it makes sense to specify that record-based log shipping can be configured
> as either asynchronous or synchronous.

I think that the statement you quote is not only misleading, but wrong.
WAL can get shipped before the transaction commits.  Perhaps the sentence
had better be

  It should be noted that by default, log shipping is asynchronous, i.e.,
  the primary server does not wait until the standby receives the data.

Yours,
Laurenz Albe
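
To make the distinction above concrete, here is a minimal toy model in
Python.  It is only a sketch: the Primary and Standby classes, the
`synchronous` flag and the record strings are invented for illustration
and do not correspond to PostgreSQL internals or configuration
parameters.  It shows that WAL records can reach the standby before the
commit, and that the two modes differ only in whether the commit waits
for the standby.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    """Toy standby that just records what it has received."""
    received: list = field(default_factory=list)

    def receive(self, record: str) -> None:
        self.received.append(record)


@dataclass
class Primary:
    """Toy primary; `synchronous` is a hypothetical stand-in for a sync setup."""
    standby: Standby
    synchronous: bool = False
    wal: list = field(default_factory=list)
    shipped: int = 0  # number of WAL records already sent to the standby

    def write(self, record: str) -> None:
        self.wal.append(record)

    def ship(self) -> None:
        """Send every not-yet-shipped record, whether committed or not."""
        for record in self.wal[self.shipped:]:
            self.standby.receive(record)
        self.shipped = len(self.wal)

    def commit(self, xid: int) -> None:
        self.write(f"COMMIT {xid}")
        if self.synchronous:
            # Synchronous: do not return until the standby has the commit record.
            self.ship()
        # Asynchronous: return immediately; shipping happens at some later point.


standby = Standby()
primary = Primary(standby)          # asynchronous by default
primary.write("INSERT xid=1")
primary.ship()                      # the data record is shipped before the commit
primary.commit(1)                   # returns without waiting for the standby
print(standby.received)             # ['INSERT xid=1']
```

In the asynchronous run the commit record exists on the primary but has
not reached the standby yet, which is the data loss window the
documentation paragraph goes on to describe.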






* Re: Inaccurate statement about log shipping replication mode

From: Laurenz Albe @ 2025-08-27 12:13 UTC (permalink / raw)
  To: [email protected]; [email protected]

On Mon, 2025-08-25 at 09:58 +0200, Laurenz Albe wrote:
> On Thu, 2025-08-21 at 15:20 +0000, PG Doc comments form wrote:
> > Page: https://www.postgresql.org/docs/17/warm-standby.html
> > 
> > The documentation page about Log-Shipping Standby Servers after describing
> > that there are file-based log shipping and record-based log shipping
> > (streaming replication) states: "It should be noted that log shipping is
> > asynchronous, i.e., the WAL records are shipped after transaction commit.".
> > This statement is misleading because the same page includes a section about
> > configuring synchronous streaming replication. To avoid confusion, I think
> > it makes sense to specify that record-based log shipping can be configured
> > as either asynchronous or synchronous.
> 
> I think that the statement you quote is not only misleading, but wrong.
> WAL can get shipped before the transaction commits.  Perhaps the sentence
> had better be
> 
>   It should be noted that by default, log shipping is asynchronous, i.e.,
>   the primary server does not wait until the standby receives the data.

Here is a patch for that.

Yours,
Laurenz Albe


Attachments:

  [text/x-patch] v1-0001-Fix-doc-defining-asynchronous-replication.patch (1.5K, 2-v1-0001-Fix-doc-defining-asynchronous-replication.patch)
  download | inline diff:
From 97cb9a4e36ac035e1dcc108dd6d36033898ccd36 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <[email protected]>
Date: Wed, 27 Aug 2025 14:10:41 +0200
Subject: [PATCH v1] Fix doc defining asynchronous replication

The statement was factually wrong: WAL records can get shipped
to the standby before the transaction commits.  The key point
is that the primary does not wait for the standby.

Author: Laurenz Albe <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 doc/src/sgml/high-availability.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index b47d8b4106e..334b8a4652a 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -527,8 +527,8 @@ protocol to make nodes agree on a serializable transactional order.
   </para>
 
   <para>
-   It should be noted that log shipping is asynchronous, i.e., the WAL
-   records are shipped after transaction commit. As a result, there is a
+   It should be noted that log shipping is asynchronous, i.e., the primary server does
+   not wait until the standby receives the data.  As a result, there is a
    window for data loss should the primary server suffer a catastrophic
    failure; transactions not yet shipped will be lost.  The size of the
    data loss window in file-based log shipping can be limited by use of the
-- 
2.51.0




* Re: Inaccurate statement about log shipping replication mode

From: Michael Paquier @ 2025-08-31 23:20 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: [email protected]; [email protected]

On Wed, Aug 27, 2025 at 02:13:21PM +0200, Laurenz Albe wrote:
> Here is a patch for that.
> --- a/doc/src/sgml/high-availability.sgml
> +++ b/doc/src/sgml/high-availability.sgml
> @@ -527,8 +527,8 @@ protocol to make nodes agree on a serializable transactional order.
>    </para>
>  
>    <para>
> -   It should be noted that log shipping is asynchronous, i.e., the WAL
> -   records are shipped after transaction commit. As a result, there is a
> +   It should be noted that log shipping is asynchronous, i.e., the primary server does
> +   not wait until the standby receives the data.  As a result, there is a
>     window for data loss should the primary server suffer a catastrophic
>     failure; transactions not yet shipped will be lost.  The size of the
>     data loss window in file-based log shipping can be limited by use of the

Yep, the original statement is rather inexact.  Now, your new wording
does not make me really comfortable with the case of cascading standbys
in scope, because the asynchronous property applies to them all the
time.

Hmm.  I'd suggest using a simpler reformulation, like this one, to
outline that there is no relationship between the timing of a
transaction commit and the time at which the commit records are flushed
on a standby server:
   It should be noted that log shipping is asynchronous, i.e., the WAL
   records may be shipped after transaction commit.
--
Michael



* Re: Inaccurate statement about log shipping replication mode

From: Artem Gavrilov @ 2025-09-01 11:51 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Laurenz Albe <[email protected]>; [email protected]

On Mon, Sep 1, 2025 at 1:20 AM Michael Paquier <[email protected]> wrote:
>
> Yep, the original statement is rather inexact.  Now, your new wording
> does not make me really comfortable with the case of cascading standbys
> in scope, because the asynchronous property applies to them all the
> time.


This is another unclear part. As I understand in configuration `Master
-> Upstream -> Downstream` replication between Master And Upstream
still can be synchronous, while between Upstream and Downstream is't
always async. Am I wrong here?

--

Artem Gavrilov
Senior Software Engineer, Percona

[email protected]






* Re: Inaccurate statement about log shipping replication mode

From: Laurenz Albe @ 2025-09-02 07:30 UTC (permalink / raw)
  To: Artem Gavrilov <[email protected]>; Michael Paquier <[email protected]>; +Cc: [email protected]

On Mon, 2025-09-01 at 13:51 +0200, Artem Gavrilov wrote:
> As I understand in configuration `Master
> -> Upstream -> Downstream` replication between Master And Upstream
> still can be synchronous, while between Upstream and Downstream is't
> always async. Am I wrong here?






* Re: Inaccurate statement about log shipping replication mode

From: Laurenz Albe @ 2025-09-02 07:34 UTC (permalink / raw)
  To: Artem Gavrilov <[email protected]>; Michael Paquier <[email protected]>; +Cc: [email protected]

On Mon, 2025-09-01 at 13:51 +0200, Artem Gavrilov wrote:
> As I understand in configuration `Master
> -> Upstream -> Downstream` replication between Master And Upstream
> still can be synchronous, while between Upstream and Downstream is't
> always async. Am I wrong here?

I don't quite understand.  Sure, you can have synchronous replication
between the master and upstream.  It is the "isn't always async" part
that confuses me.  Do you mean that WAL can reach downstream before
the master commits?  That is certainly the case.

Yours,
Laurenz Albe






* Re: Inaccurate statement about log shipping replication mode

From: Artem Gavrilov @ 2025-09-02 09:22 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Michael Paquier <[email protected]>; [email protected]

Oh, sorry, I made a typo; it should be "is always async". I was
referring to this statement in the docs about cascading replication:
"Cascading replication is currently asynchronous". It sounds to me
like the whole replication setup is async (M -> U -> D), but it's only
the (U -> D) part that is always async. But that is probably a topic for
another thread.

My original problem was with the first sentence "It should be noted
that log shipping is asynchronous". I think your original suggestion
"It should be noted that by default, log shipping is asynchronous"
sounds good as it highlights from the beginning that there is some
variety.

On Tue, Sep 2, 2025 at 9:34 AM Laurenz Albe <[email protected]> wrote:
>
> On Mon, 2025-09-01 at 13:51 +0200, Artem Gavrilov wrote:
> > As I understand in configuration `Master
> > -> Upstream -> Downstream` replication between Master And Upstream
> > still can be synchronous, while between Upstream and Downstream is't
> > always async. Am I wrong here?
>
> I don't quite understand.  Sure, you can have synchronous replication
> between the master and upstream.  It is the "isn't always async" part
> that confuses me.  Do you mean that WAL can reach downstream before
> the master commits?  That is certainly the case.
>
> Yours,
> Laurenz Albe



-- 

Artem Gavrilov
Senior Software Engineer, Percona

[email protected]






* Re: Inaccurate statement about log shipping replication mode

From: Laurenz Albe @ 2025-09-02 12:48 UTC (permalink / raw)
  To: Artem Gavrilov <[email protected]>; +Cc: Michael Paquier <[email protected]>; [email protected]

On Tue, 2025-09-02 at 11:22 +0200, Artem Gavrilov wrote:
> My original problem was with the first sentence "It should be noted
> that log shipping is asynchronous". I think your original suggestion
> "It should be noted that by default, log shipping is asynchronous"
> sounds good as it highlights from the beginning that there is some
> variety.

Hm, yes, we could add "by default".

Yours,
Laurenz Albe






* Re: Inaccurate statement about log shipping replication mode

From: Robert Treat @ 2025-09-02 15:10 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Artem Gavrilov <[email protected]>; Michael Paquier <[email protected]>; [email protected]

On Tue, Sep 2, 2025 at 8:48 AM Laurenz Albe <[email protected]> wrote:
>
> On Tue, 2025-09-02 at 11:22 +0200, Artem Gavrilov wrote:
> > My original problem was with the first sentence "It should be noted
> > that log shipping is asynchronous". I think your original suggestion
> > "It should be noted that by default, log shipping is asynchronous"
> > sounds good as it highlights from the beginning that there is some
> > variety.
>
> Hm, yes, we could add "by default".
>

I think the issue here is that this section is supposed to focus on
continuous archiving / file based WAL shipping, which is asynchronous.
All of the complexity that is being discussed in this thread is really
about WAL streaming, which IMO should not be discussed here. Per the
docs, "Record-based log shipping is more granular and streams WAL
changes incrementally over a network connection (see Section 26.2.5)."

I actually think the thing that is wrong (or at least confusing) in
the docs is this line "Directly moving WAL records from one database
server to another is typically described as log shipping." because it
is too loose with its definition. I don't recall Postgres people
referring to streaming replication as "WAL shipping"; that term is
pretty exclusively used for continuous archiving. If you look in the
aforementioned 26.2.5. Streaming Replication, the term "shipping" is
only ever used in conjunction with the phrase "file-based log
shipping".

So with that said, I would suggest fixing this by changing the first
sentence of paragraph 4 to "It should be noted that file based log
shipping is asynchronous", as this also emphasizes that this section
is focused on file based wal shipping.

A larger fix would likely involve reworking this section to start with
defining log shipping and how it is used in Postgres, and then
continuing with the file based specific info (something like moving
the third paragraph to the beginning and then editing things for
clarity / readability). I could work up a patch for that if people
were interested.

Robert Treat
https://xzilla.net






* Re: Inaccurate statement about log shipping replication mode

From: Michael Paquier @ 2025-09-03 05:59 UTC (permalink / raw)
  To: Robert Treat <[email protected]>; +Cc: Laurenz Albe <[email protected]>; Artem Gavrilov <[email protected]>; [email protected]

On Tue, Sep 02, 2025 at 11:10:42AM -0400, Robert Treat wrote:
> So with that said, I would suggest fixing this by changing the first
> sentence of paragraph 4 to "It should be noted that file based log
> shipping is asynchronous", as this also emphasizes that this section
> is focused on file based wal shipping.

Not sure that there is a strong need for "file-based".  Still, it is
true that we could just remove the inexact part of the sentence and
call it a day, as in:
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -527,8 +527,7 @@ protocol to make nodes agree on a serializable transactional order.
   </para>
 
   <para>
-   It should be noted that log shipping is asynchronous, i.e., the WAL
-   records are shipped after transaction commit. As a result, there is a
+   It should be noted that log shipping is asynchronous. As a result, there is a

--
Michael



* Re: Inaccurate statement about log shipping replication mode

From: Laurenz Albe @ 2025-09-03 07:37 UTC (permalink / raw)
  To: Robert Treat <[email protected]>; +Cc: Artem Gavrilov <[email protected]>; Michael Paquier <[email protected]>; [email protected]

On Tue, 2025-09-02 at 11:10 -0400, Robert Treat wrote:
> I think the issue here is that this section is supposed to focus on
> continuous archiving / file based WAL shipping, which is asynchronous.
> All of the complexity that is being discussed in this thread is really
> about WAL streaming, which IMO should not be discussed here. Per the
> docs, "Record-based log shipping is more granular and streams WAL
> changes incrementally over a network connection (see Section 26.2.5)."

Chapter 26.2. is "Log-Shipping Standby Servers".
The first line seems to confirm what you are saying:

    Continuous archiving can be used to create a high availability (HA)
    cluster configuration with one or more standby servers ready to
    take over operations if the primary server fails. This capability
    is widely referred to as warm standby or log shipping.

But one of the subsections is 26.2.5. "Streaming Replication", which
suggests that streaming replication is a kind of log shipping.

> I actually think the thing that is wrong (or at least confusing) in
> the docs is this line "Directly moving WAL records from one database
> server to another is typically described as log shipping." because it
> is too loose with its definition. I don't recall postgres people
> referring to streaming replication as "wal shipping", that term is
> pretty exclusively used for continuous archiving. If you look in the
> aforementioned 26.2.5. Streaming Replication, the term "shipping" is
> only ever used in conjunction with the phrase "file-based log
> shipping".
> 
> So with that said, I would suggest fixing this by changing the first
> sentence of paragraph 4 to "It should be noted that file based log
> shipping is asynchronous", as this also emphasizes that this section
> is focused on file based wal shipping.
> 
> A larger fix would likely involve reworking this section to start with
> defining log shipping and how it is used in Postgres, and then
> continuing with the file based specific info (something like moving
> the third paragraph to the beginning and then editing things for
> clarity / readability). I could work up a patch for that if people
> were interested.

I agree that it is a worthwhile goal to clarify the terms, and I
think that the whole chapter should be reorganized:

Sections 26.2.5. to 26.2.9. should be moved to a new chapter
26.3. "Streaming Replication" (which will renumber the present 26.3.
and 26.4.).

Perhaps "WAL shipping" would be a better term, with "WAL streaming"
as alternative.

But that would be a bigger endeavour that would require going over
bigger parts of the documentation.  If you want to do that, I'd be
happy to review it.

But I think that the factually wrong statement that my patch
tries to address should get fixed first - who knows how long the
bigger patch would take.

I am OK with Michael's suggestion to just remove the wrong line,
although it wouldn't be bad to have an explanation of what we mean
by "asynchronous" here.

Yours,
Laurenz Albe






* Re: Inaccurate statement about log shipping replication mode

From: Michael Paquier @ 2025-09-04 01:28 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Robert Treat <[email protected]>; Artem Gavrilov <[email protected]>; [email protected]; David Steele <[email protected]>

On Wed, Sep 03, 2025 at 09:37:08AM +0200, Laurenz Albe wrote:
> I agree that it is a worthwhile goal to clarify the terms, and I
> think that the whole chapter should be reorganized:
> 
> Sections 26.2.5. to 26.2.9. should be moved to a new chapter
> 26.3. "Streaming Replication" (which will renumber the present 26.3.
> and 26.4.).

I would not disagree with that.  For one, the situation in the docs
can be confusing, as we mix file-based WAL files being moved around
with streaming over the replication protocol.

One interesting portion is about replication slots, where we rely on
XLogGetReplicationSlotMinimumLSN() to decide the retention threshold.
Physical slots are updated in WAL senders via
PhysicalConfirmReceivedLocation(), meaning that the replication protocol
is required.  Mixing that with the file-shipping part is a mistake.

Just moving the contents to a new "Streaming" section sounds like an
improvement, but the "log-shipping" part would still suck.  So this
calls for cleanup as well, providing a better split.  Perhaps we
should embrace the term "file-based WAL shipping" or "file-based log
shipping" and use that, giving a structure of:
* WAL shipping methods, log-shipping methods or just "Log Shipping"
** File-based WAL shipping
** Streaming

Warm standbys can use both methods.  The part about planning,
operation and preparing may be worth splitting out of the "method"
portion.  The "continuous" archiving on standbys is not about
streaming, but about the file-based method, so it would need to be
inside the file-based subsection.  We could replace "Log" with just
"WAL" as well, if we're looking at more standardization of the whole
area while at it.

> Perhaps "WAL shipping" would be a better term, with "WAL streaming"
> as alternative.

Perhaps that stands for improvement and more standardization.  This
term originates from 5e550acbc4d1 in 2006.  The industry has changed a
lot since and there may be standard terms which are much better suited
to the "modern" user, even if there's a lot of Postgres-ism in the
architecture and how things are done.  There have been some proposals,
but nobody really stood up to commit something.

> But that would be a bigger endeavour that would require going over
> bigger parts of the documentation.  If you want to do that, I'd be
> happy to review it.
> 
> But I think that the factually wrong statement that my patch
> tries to address should get fixed first - who knows how long the
> bigger patch would take.
> 
> I am OK with Michael's suggestion to just remove the wrong line,
> although it wouldn't be bad to have an explanation of what we mean
> by "asynchronous" here.

Yeah, this statement is confusing as-is because there is no
dependency on the timing of a transaction commit: records may be
shipped before or after, depending on how your system balances its IO
and/or CPU.  I am not sure if this is worth applying on its own, TBH,
because this stuff needs much more rework than a simple sentence.  If
somebody takes the time to write a patch, I'd be OK to step in this
time for review and doing some reorganization of the whole section,
even if that would mean a HEAD-only change.  I had the attached staged
at some point, for reference.
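
To make the timing argument concrete, here is a minimal Python sketch
of the data-loss window.  It is a toy model under stated assumptions,
not how the server is implemented: LSNs are plain integers, the
standby is described only by the highest position it has received, and
"synchronous" is simplified to "the primary acknowledges a commit only
once the standby has received the commit record" (the real
synchronous_commit setting has several levels).  The function names
acknowledged and lost_on_failover are made up for illustration.

```python
def acknowledged(commit_lsns, standby_lsn, synchronous):
    """Commits that the primary has reported as successful to clients.

    Asynchronous: every locally durable commit is acknowledged.
    Synchronous (simplified): only commits whose commit record has
    reached the standby are acknowledged.
    """
    if synchronous:
        return [lsn for lsn in commit_lsns if lsn <= standby_lsn]
    return list(commit_lsns)


def lost_on_failover(commit_lsns, standby_lsn, synchronous):
    """Acknowledged commits that are missing on the standby after a crash."""
    return [
        lsn
        for lsn in acknowledged(commit_lsns, standby_lsn, synchronous)
        if lsn > standby_lsn
    ]


# Commit records of three transactions, by WAL position.
commits = [100, 200, 300]

# The standby has received WAL up to 250.  That may already include
# earlier, non-commit records of the third transaction: shipping is not
# tied to commit time, only the commit record at 300 is still missing.
standby_lsn = 250

print(lost_on_failover(commits, standby_lsn, synchronous=False))  # [300]
print(lost_on_failover(commits, standby_lsn, synchronous=True))   # []
```

In this model the asynchronous case loses a commit that the client was
told had succeeded, while the synchronous case never acknowledged it in
the first place, which is the distinction the disputed sentence blurs.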

Adding David Steele in CC, I recall that he may have done a proposal
around all that for the docs, and he's involved in backrest.
--
Michael


Attachments:

  [text/x-diff] 0001-doc-Remove-confusing-sentence-about-async-log-shippi.patch (1.6K, 2-0001-doc-Remove-confusing-sentence-about-async-log-shippi.patch)
  download | inline diff:
From 542db9e02f5aaafe4c831797133acb1aff5d7828 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Thu, 4 Sep 2025 10:22:08 +0900
Subject: [PATCH] doc: Remove confusing sentence about async log shipping

The original sentence is old, as of 5e550acbc4d1, referring to a
dependency with transaction commit and the timing of the records
flushed, which may not be always true.

Reported-by: Artem Gavrilov <[email protected]>
Reviewed-by: Laurenz Albe <[email protected]>
Reviewed-by: Robert Treat <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 doc/src/sgml/high-availability.sgml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index b47d8b4106ef..ffeff3f2b247 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -527,9 +527,8 @@ protocol to make nodes agree on a serializable transactional order.
   </para>
 
   <para>
-   It should be noted that log shipping is asynchronous, i.e., the WAL
-   records are shipped after transaction commit. As a result, there is a
-   window for data loss should the primary server suffer a catastrophic
+   It should be noted that log shipping is asynchronous. As a result, there
+   is a window for data loss should the primary server suffer a catastrophic
    failure; transactions not yet shipped will be lost.  The size of the
    data loss window in file-based log shipping can be limited by use of the
    <varname>archive_timeout</varname> parameter, which can be set as low
-- 
2.51.0



  [application/pgp-signature] signature.asc (833B, 3-signature.asc)

* Re: Inaccurate statement about log shipping replication mode
  2025-08-21 15:20 Inaccurate statement about log shipping replication mode PG Doc comments form <[email protected]>
  2025-08-25 07:58 ` Re: Inaccurate statement about log shipping replication mode Laurenz Albe <[email protected]>
  2025-08-27 12:13   ` Re: Inaccurate statement about log shipping replication mode Laurenz Albe <[email protected]>
  2025-08-31 23:20     ` Re: Inaccurate statement about log shipping replication mode Michael Paquier <[email protected]>
@ 2025-09-02 07:28       ` Laurenz Albe <[email protected]>
  1 sibling, 0 replies; 14+ messages in thread

From: Laurenz Albe @ 2025-09-02 07:28 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: [email protected]; [email protected]

On Mon, 2025-09-01 at 08:20 +0900, Michael Paquier wrote:
> On Wed, Aug 27, 2025 at 02:13:21PM +0200, Laurenz Albe wrote:
> > Here is a patch for that.
> > --- a/doc/src/sgml/high-availability.sgml
> > +++ b/doc/src/sgml/high-availability.sgml
> > @@ -527,8 +527,8 @@ protocol to make nodes agree on a serializable transactional order.
> >    </para>
> >  
> >    <para>
> > -   It should be noted that log shipping is asynchronous, i.e., the WAL
> > -   records are shipped after transaction commit. As a result, there is a
> > +   It should be noted that log shipping is asynchronous, i.e., the primary server does
> > +   not wait until the standby receives the data.  As a result, there is a
> >     window for data loss should the primary server suffer a catastrophic
> >     failure; transactions not yet shipped will be lost.  The size of the
> >     data loss window in file-based log shipping can be limited by use of the
> 
> Yep, the original statement is rather inexact.  Now, your new wording
> does not make me really comfortable with the case of cascading standbys
> in scope, because the asynchronous property applies to them all the
> time.
> 
> Hmm.  I'd suggest using a simpler reformulation, like this one, to
> outline that there is no relationship between the timing of a
> transaction commit and the timing where the commit records are flushed
> on a standby server:
>    It should be noted that log shipping is asynchronous, i.e., the WAL
>    records may be shipped after transaction commit.

That is a less invasive change and probably preferable.
The attached patch does it like you suggested.

I noticed that the paragraph speaks about the asynchronicity of replication
and the potential of data loss, so I couldn't resist the temptation to add
a remark that synchronous streaming replication can avoid that problem.

Yours,
Laurenz Albe


Attachments:

  [text/x-patch] v2-0001-Fix-doc-defining-asynchronous-replication.patch (2.1K, 2-v2-0001-Fix-doc-defining-asynchronous-replication.patch)
  download | inline diff:
From 221e86b2a821b4f0d812448fbe879df242c6ca05 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <[email protected]>
Date: Tue, 2 Sep 2025 09:24:06 +0200
Subject: [PATCH v2] Fix doc defining asynchronous replication

The statement was factually wrong: WAL records can get shipped
to the standby before the transaction commits.  The key point
is that the primary does not wait for the standby.

Since the paragraph stresses the potential data loss, add a
remark that synchronous replication can be used to avoid that
problem.

Author: Laurenz Albe <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 doc/src/sgml/high-availability.sgml | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index b47d8b4106e..041caba239d 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -528,7 +528,7 @@ protocol to make nodes agree on a serializable transactional order.
 
   <para>
    It should be noted that log shipping is asynchronous, i.e., the WAL
-   records are shipped after transaction commit. As a result, there is a
+   records may be shipped after transaction commit.  As a result, there is a
    window for data loss should the primary server suffer a catastrophic
    failure; transactions not yet shipped will be lost.  The size of the
    data loss window in file-based log shipping can be limited by use of the
@@ -536,7 +536,10 @@ protocol to make nodes agree on a serializable transactional order.
    as a few seconds.  However such a low setting will
    substantially increase the bandwidth required for file shipping.
    Streaming replication (see <xref linkend="streaming-replication"/>)
-   allows a much smaller window of data loss.
+   allows a much smaller window of data loss, and synchronous streaming
+   replication (see <xref linkend="synchronous-replication"/>) can
+   guarantee that no transaction is reported as committed before the
+   WAL records have reached the standby server.
   </para>
 
   <para>
-- 
2.51.0





Thread overview: 14+ messages
2025-08-21 15:20 Inaccurate statement about log shipping replication mode PG Doc comments form <[email protected]>
2025-08-25 07:58 ` Laurenz Albe <[email protected]>
2025-08-27 12:13   ` Laurenz Albe <[email protected]>
2025-08-31 23:20     ` Michael Paquier <[email protected]>
2025-09-01 11:51       ` Artem Gavrilov <[email protected]>
2025-09-02 07:30         ` Laurenz Albe <[email protected]>
2025-09-02 07:34         ` Laurenz Albe <[email protected]>
2025-09-02 09:22           ` Artem Gavrilov <[email protected]>
2025-09-02 12:48             ` Laurenz Albe <[email protected]>
2025-09-02 15:10               ` Robert Treat <[email protected]>
2025-09-03 05:59                 ` Michael Paquier <[email protected]>
2025-09-03 07:37                 ` Laurenz Albe <[email protected]>
2025-09-04 01:28                   ` Michael Paquier <[email protected]>
2025-09-02 07:28       ` Laurenz Albe <[email protected]>
