Re: Unexpected data when subscribing to logical replication slot

public inbox for [email protected]  
help / color / mirror / Atom feed

Re: Unexpected data when subscribing to logical replication slot
7+ messages / 3 participants
[nested] [flat]

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-08 16:28  Adrian Klaver <[email protected]>
  0 siblings, 1 reply; 7+ messages in thread

From: Adrian Klaver @ 2024-05-08 16:28 UTC (permalink / raw)
  To: Daniel McKenzie <[email protected]>; +Cc: pgsql-general

On 5/8/24 08:24, Daniel McKenzie wrote:
> It's running both (in docker containers) and also quite a few more 
> docker containers running various .NET applications.

I think what you found is that the r7a.medium instance is not capable 
enough to do all that it is asked without introducing lag under load. 
Answering the questions posed by Tomas Vondra would help get to the 
actual cause of the lag.

In meantime my suspicion is this part:

"For example, when I use a psql terminal to update a user's last name 
from "Jones" to "Smith" then I would expect the enrichment query to find 
"Smith" but it will sometimes still find "Jones". It finds the old data 
perhaps 1 in 50 times."

If this is being run against the Postgres server my guess is that 
synchronous_commit=on is causing the commit on the server to wait for 
the WAL records to be flushed to disk and this is not happening in a 
timely manner in the '... 1 in 50 times' you mention. In that case you 
see the old values not the new committed values. This seems to be 
confirmed when you set synchronous_commit=off and don't see old values.
For completeness per:

https://www.postgresql.org/docs/current/wal-async-commit.html

"However, for short transactions this delay is a major component of the 
total transaction time. Selecting asynchronous commit mode means that 
the server returns success as soon as the transaction is logically 
completed, before the WAL records it generated have actually made their 
way to disk. This can provide a significant boost in throughput for 
small transactions.

Asynchronous commit introduces the risk of data loss. There is a short 
time window between the report of transaction completion to the client 
and the time that the transaction is truly committed (that is, it is 
guaranteed not to be lost if the server crashes).  ...
"

> 
> Daniel McKenzie
> Software Developer

-- 
Adrian Klaver
[email protected]

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-09 07:32  Daniel McKenzie <[email protected]>
  parent: Adrian Klaver <[email protected]>
  0 siblings, 3 replies; 7+ messages in thread

From: Daniel McKenzie @ 2024-05-09 07:32 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; [email protected]; +Cc: pgsql-general

>
> Asynchronous commit introduces the risk of data loss. There is a short
> time window between the report of transaction completion to the client
> and the time that the transaction is truly committed.

The documentation speaks about synchronous_commit changing how transactions
change behaviour for the client. So in this case, my psql terminal is the
client, and I would expect a faster commit (from its perspective) and then
a period of risk (as a process usually done as part of the commit is now
being done in the background) but it's not clear how that affects a
replication slot subscriber.

What we're struggling to understand is: why are we seeing any updates in
the replication slot before they have been "truly committed"?

There appears to be a state of limbo between updating data and that data
being available to query (and our subscriber is picking up changes in this
period of time) but I can't pin down any documentation which describes it.

We've had this running in live now for years without a hiccup so we are
surprised to learn that we have this massive race condition and it just so
happens that the hardware is fast enough to process the transaction before
the .NET application can react to replication slot changes.

Daniel McKenzie
Software Developer

Office: +1 403.910.5927 x 251
Mobile: +44 7712 159045
Website: www.curvedental.com

*Curve Dental Confidentiality Notice*
This message is intended exclusively for the individual or entity to which
it is addressed. This communication may contain information that is
proprietary, privileged, confidential, or otherwise legally exempt from
disclosure. If you are not the named addressee, you are not authorized to
read, print, retain, copy, or disseminate this message or any part of it.
If you have received this message in error, please notify the sender
immediately by replying to this e-mail and delete all copies of this
message.

On Wed, May 8, 2024 at 5:28 PM Adrian Klaver <[email protected]>
wrote:

> On 5/8/24 08:24, Daniel McKenzie wrote:
> > It's running both (in docker containers) and also quite a few more
> > docker containers running various .NET applications.
>
> I think what you found is that the r7a.medium instance is not capable
> enough to do all that it is asked without introducing lag under load.
> Answering the questions posed by Tomas Vondra would help get to the
> actual cause of the lag.
>
> In meantime my suspicion is this part:
>
> "For example, when I use a psql terminal to update a user's last name
> from "Jones" to "Smith" then I would expect the enrichment query to find
> "Smith" but it will sometimes still find "Jones". It finds the old data
> perhaps 1 in 50 times."
>
> If this is being run against the Postgres server my guess is that
> synchronous_commit=on is causing the commit on the server to wait for
> the WAL records to be flushed to disk and this is not happening in a
> timely manner in the '... 1 in 50 times' you mention. In that case you
> see the old values not the new committed values. This seems to be
> confirmed when you set synchronous_commit=off and don't see old values.
> For completeness per:
>
> https://www.postgresql.org/docs/current/wal-async-commit.html
>
> "However, for short transactions this delay is a major component of the
> total transaction time. Selecting asynchronous commit mode means that
> the server returns success as soon as the transaction is logically
> completed, before the WAL records it generated have actually made their
> way to disk. This can provide a significant boost in throughput for
> small transactions.
>
> Asynchronous commit introduces the risk of data loss. There is a short
> time window between the report of transaction completion to the client
> and the time that the transaction is truly committed (that is, it is
> guaranteed not to be lost if the server crashes).  ...
> "
>
> >
> > Daniel McKenzie
> > Software Developer
>
> --
> Adrian Klaver
> [email protected]
>
>

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-09 15:07  Adrian Klaver <[email protected]>
  parent: Daniel McKenzie <[email protected]>
  2 siblings, 0 replies; 7+ messages in thread

From: Adrian Klaver @ 2024-05-09 15:07 UTC (permalink / raw)
  To: Daniel McKenzie <[email protected]>; [email protected]; +Cc: pgsql-general

On 5/9/24 00:32, Daniel McKenzie wrote:
>     Asynchronous commit introduces the risk of data loss. There is a short
>     time window between the report of transaction completion to the client
>     and the time that the transaction is truly committed.
> 

To get anywhere with this issue you will need to provide the information 
Tomas Vondra requested upstream:

"
Where/how does the enrichment query run? How does the whole process look 
like? I guess an application is receiving decoded changes as JSON, and 
then querying the database?
"

and

"
Would be good to have some sort of reproducer - ideally a script that
sets up an instance + replication, and demonstrates the issue. Or at
least a sufficiently detailed steps to reproduce it without having to
guess what exactly you did.
"
> 
> The documentation speaks about synchronous_commit changing how 
> transactions change behaviour for the client. So in this case, my psql 
> terminal is the client, and I would expect a faster commit (from its 
> perspective) and then a period of risk (as a process usually done as 
> part of the commit is now being done in the background) but it's not 
> clear how that affects a replication slot subscriber.
> 
> What we're struggling to understand is: why are we seeing any updates in 
> the replication slot before they have been "truly committed"?

From:

https://www.postgresql.org/docs/current/wal-async-commit.html

"As described in the previous section, transaction commit is normally 
synchronous: the server waits for the transaction's WAL records to be 
flushed to permanent storage before returning a success indication to 
the client. The client is therefore guaranteed that a transaction 
reported to be committed will be preserved, even in the event of a 
server crash immediately after. However, for short transactions this 
delay is a major component of the total transaction time. Selecting 
asynchronous commit mode means that the server returns success as soon 
as the transaction is logically completed, before the WAL records it 
generated have actually made their way to disk. This can provide a 
significant boost in throughput for small transactions."

It is about the state of the WAL record, in synchronous_commit=on the 
commit will not happen on the server and be seen by the client until the 
WAL record has been recorded as saved to disk. For 
synchronous_commit=off the commit happens as soon as COMMIT is reached 
in the transaction and the WAL record save happens after that.

I don't use wal2json so I don't know how it deals with the above.

> 
> There appears to be a state of limbo between updating data and that data 
> being available to query (and our subscriber is picking up changes in 
> this period of time) but I can't pin down any documentation which 
> describes it.
The answer awaits a the full description of the process requested by 
Tomas Vondra.

Best guess, the fact that synchronous_commit=off 'cures' it implies that 
in synchronous_commit=on mode you are picking up data on the receiving 
and sending end at different points in '... the server waits for the 
transaction's WAL records to be flushed to permanent storage before 
returning a success indication to the client.'

> 
> We've had this running in live now for years without a hiccup so we are 
> surprised to learn that we have this massive race condition and it just 

I would not say "...perhaps 1 in 50 times" is massive.

> so happens that the hardware is fast enough to process the transaction 
> before the .NET application can react to replication slot changes.
> 
> Daniel McKenzie
> Software Developer

-- 
Adrian Klaver
[email protected]

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-09 15:14  Torsten Förtsch <[email protected]>
  parent: Daniel McKenzie <[email protected]>
  2 siblings, 1 reply; 7+ messages in thread

From: Torsten Förtsch @ 2024-05-09 15:14 UTC (permalink / raw)
  To: Daniel McKenzie <[email protected]>; +Cc: Adrian Klaver <[email protected]>; [email protected]; pgsql-general

I would not find this behavior surprising in particular if you have a
synchronous replica. According to the documentation of synchronous_commit:

   The local behavior of all non-off modes is to wait for local flush of
WAL to disk.

This is when the logical decoder sees the item. But that does not mean the
change is visible to other transactions in the MVCC sense. So, if wal2json
and the rest of your stuff is fast enough, then the enrichment query may
very well read old data.

A transaction being committed means all the WAL has been written (and
usually synced) to disk including the bit in the pg_xact directory.

On Thu, May 9, 2024 at 9:32 AM Daniel McKenzie <
[email protected]> wrote:

> Asynchronous commit introduces the risk of data loss. There is a short
>> time window between the report of transaction completion to the client
>> and the time that the transaction is truly committed.
>
>
> The documentation speaks about synchronous_commit changing how
> transactions change behaviour for the client. So in this case, my psql
> terminal is the client, and I would expect a faster commit (from its
> perspective) and then a period of risk (as a process usually done as part
> of the commit is now being done in the background) but it's not clear how
> that affects a replication slot subscriber.
>
> What we're struggling to understand is: why are we seeing any updates in
> the replication slot before they have been "truly committed"?
>
> There appears to be a state of limbo between updating data and that data
> being available to query (and our subscriber is picking up changes in this
> period of time) but I can't pin down any documentation which describes it.
>
> We've had this running in live now for years without a hiccup so we are
> surprised to learn that we have this massive race condition and it just so
> happens that the hardware is fast enough to process the transaction before
> the .NET application can react to replication slot changes.
>
> Daniel McKenzie
> Software Developer
>
> Office: +1 403.910.5927 x 251
> Mobile: +44 7712 159045
> Website: www.curvedental.com
>
> *Curve Dental Confidentiality Notice*
> This message is intended exclusively for the individual or entity to which
> it is addressed. This communication may contain information that is
> proprietary, privileged, confidential, or otherwise legally exempt from
> disclosure. If you are not the named addressee, you are not authorized to
> read, print, retain, copy, or disseminate this message or any part of it.
> If you have received this message in error, please notify the sender
> immediately by replying to this e-mail and delete all copies of this
> message.
>
>
> On Wed, May 8, 2024 at 5:28 PM Adrian Klaver <[email protected]>
> wrote:
>
>> On 5/8/24 08:24, Daniel McKenzie wrote:
>> > It's running both (in docker containers) and also quite a few more
>> > docker containers running various .NET applications.
>>
>> I think what you found is that the r7a.medium instance is not capable
>> enough to do all that it is asked without introducing lag under load.
>> Answering the questions posed by Tomas Vondra would help get to the
>> actual cause of the lag.
>>
>> In meantime my suspicion is this part:
>>
>> "For example, when I use a psql terminal to update a user's last name
>> from "Jones" to "Smith" then I would expect the enrichment query to find
>> "Smith" but it will sometimes still find "Jones". It finds the old data
>> perhaps 1 in 50 times."
>>
>> If this is being run against the Postgres server my guess is that
>> synchronous_commit=on is causing the commit on the server to wait for
>> the WAL records to be flushed to disk and this is not happening in a
>> timely manner in the '... 1 in 50 times' you mention. In that case you
>> see the old values not the new committed values. This seems to be
>> confirmed when you set synchronous_commit=off and don't see old values.
>> For completeness per:
>>
>> https://www.postgresql.org/docs/current/wal-async-commit.html
>>
>> "However, for short transactions this delay is a major component of the
>> total transaction time. Selecting asynchronous commit mode means that
>> the server returns success as soon as the transaction is logically
>> completed, before the WAL records it generated have actually made their
>> way to disk. This can provide a significant boost in throughput for
>> small transactions.
>>
>> Asynchronous commit introduces the risk of data loss. There is a short
>> time window between the report of transaction completion to the client
>> and the time that the transaction is truly committed (that is, it is
>> guaranteed not to be lost if the server crashes).  ...
>> "
>>
>> >
>> > Daniel McKenzie
>> > Software Developer
>>
>> --
>> Adrian Klaver
>> [email protected]
>>
>>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-09 15:16  Torsten Förtsch <[email protected]>
  parent: Torsten Förtsch <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Torsten Förtsch @ 2024-05-09 15:16 UTC (permalink / raw)
  To: Daniel McKenzie <[email protected]>; +Cc: Adrian Klaver <[email protected]>; [email protected]; pgsql-general

Sorry, to correct myself. The pg_xact bit is written with the next
checkpoint. But the COMMIT record in the WAL is there.

On Thu, May 9, 2024 at 5:14 PM Torsten Förtsch <[email protected]>
wrote:

> I would not find this behavior surprising in particular if you have a
> synchronous replica. According to the documentation of synchronous_commit:
>
>    The local behavior of all non-off modes is to wait for local flush of
> WAL to disk.
>
> This is when the logical decoder sees the item. But that does not mean the
> change is visible to other transactions in the MVCC sense. So, if wal2json
> and the rest of your stuff is fast enough, then the enrichment query may
> very well read old data.
>
> A transaction being committed means all the WAL has been written (and
> usually synced) to disk including the bit in the pg_xact directory.
>
> On Thu, May 9, 2024 at 9:32 AM Daniel McKenzie <
> [email protected]> wrote:
>
>> Asynchronous commit introduces the risk of data loss. There is a short
>>> time window between the report of transaction completion to the client
>>> and the time that the transaction is truly committed.
>>
>>
>> The documentation speaks about synchronous_commit changing how
>> transactions change behaviour for the client. So in this case, my psql
>> terminal is the client, and I would expect a faster commit (from its
>> perspective) and then a period of risk (as a process usually done as part
>> of the commit is now being done in the background) but it's not clear how
>> that affects a replication slot subscriber.
>>
>> What we're struggling to understand is: why are we seeing any updates in
>> the replication slot before they have been "truly committed"?
>>
>> There appears to be a state of limbo between updating data and that data
>> being available to query (and our subscriber is picking up changes in this
>> period of time) but I can't pin down any documentation which describes it.
>>
>> We've had this running in live now for years without a hiccup so we are
>> surprised to learn that we have this massive race condition and it just so
>> happens that the hardware is fast enough to process the transaction before
>> the .NET application can react to replication slot changes.
>>
>> Daniel McKenzie
>> Software Developer
>>
>> Office: +1 403.910.5927 x 251
>> Mobile: +44 7712 159045
>> Website: www.curvedental.com
>>
>> *Curve Dental Confidentiality Notice*
>> This message is intended exclusively for the individual or entity to
>> which it is addressed. This communication may contain information that
>> is proprietary, privileged, confidential, or otherwise legally exempt from
>> disclosure. If you are not the named addressee, you are not authorized
>> to read, print, retain, copy, or disseminate this message or any part of
>> it. If you have received this message in error, please notify the sender
>> immediately by replying to this e-mail and delete all copies of this
>> message.
>>
>>
>> On Wed, May 8, 2024 at 5:28 PM Adrian Klaver <[email protected]>
>> wrote:
>>
>>> On 5/8/24 08:24, Daniel McKenzie wrote:
>>> > It's running both (in docker containers) and also quite a few more
>>> > docker containers running various .NET applications.
>>>
>>> I think what you found is that the r7a.medium instance is not capable
>>> enough to do all that it is asked without introducing lag under load.
>>> Answering the questions posed by Tomas Vondra would help get to the
>>> actual cause of the lag.
>>>
>>> In meantime my suspicion is this part:
>>>
>>> "For example, when I use a psql terminal to update a user's last name
>>> from "Jones" to "Smith" then I would expect the enrichment query to find
>>> "Smith" but it will sometimes still find "Jones". It finds the old data
>>> perhaps 1 in 50 times."
>>>
>>> If this is being run against the Postgres server my guess is that
>>> synchronous_commit=on is causing the commit on the server to wait for
>>> the WAL records to be flushed to disk and this is not happening in a
>>> timely manner in the '... 1 in 50 times' you mention. In that case you
>>> see the old values not the new committed values. This seems to be
>>> confirmed when you set synchronous_commit=off and don't see old values.
>>> For completeness per:
>>>
>>> https://www.postgresql.org/docs/current/wal-async-commit.html
>>>
>>> "However, for short transactions this delay is a major component of the
>>> total transaction time. Selecting asynchronous commit mode means that
>>> the server returns success as soon as the transaction is logically
>>> completed, before the WAL records it generated have actually made their
>>> way to disk. This can provide a significant boost in throughput for
>>> small transactions.
>>>
>>> Asynchronous commit introduces the risk of data loss. There is a short
>>> time window between the report of transaction completion to the client
>>> and the time that the transaction is truly committed (that is, it is
>>> guaranteed not to be lost if the server crashes).  ...
>>> "
>>>
>>> >
>>> > Daniel McKenzie
>>> > Software Developer
>>>
>>> --
>>> Adrian Klaver
>>> [email protected]
>>>
>>>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-09 16:45  Adrian Klaver <[email protected]>
  parent: Daniel McKenzie <[email protected]>
  2 siblings, 1 reply; 7+ messages in thread

From: Adrian Klaver @ 2024-05-09 16:45 UTC (permalink / raw)
  To: Daniel McKenzie <[email protected]>; [email protected]; +Cc: pgsql-general

On 5/9/24 00:32, Daniel McKenzie wrote:

> We've had this running in live now for years without a hiccup so we are 
> surprised to learn that we have this massive race condition and it just 
> so happens that the hardware is fast enough to process the transaction 
> before the .NET application can react to replication slot changes.

On broad scale I'm going to say that over 'for years' there has been an 
increase in load on the Postgres server as well as the I/0 system of the 
machine it is running on. What you are seeing now is the canary in the 
mine giving you the heads up that more trouble lies ahead as the 
hardware and software is reaching load limits.

On finer scale my guess is that the following is happening when 
synchronous_commit = on:

1) Postgres session #1 does data change.

2) This is replicated out and picked up by wal2json, which sees the new 
data.

3) The Postgres server waits for the confirmation that the WAL record 
has been written out to disk. Upon confirmation it commits on the 
server. This is the part that I am not sure of in relation to wal2json.

4) Postgres session #2 queries the database for the record. In the case 
where 3) has not completed it sees the old values as the data change in 
session #1 has not committed and therefore the new values are not seen 
by other sessions.

> 
> Daniel McKenzie
> Software Developer

-- 
Adrian Klaver
[email protected]

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Unexpected data when subscribing to logical replication slot
@ 2024-05-10 12:11  Daniel McKenzie <[email protected]>
  parent: Adrian Klaver <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Daniel McKenzie @ 2024-05-10 12:11 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; [email protected]; [email protected]; +Cc: pgsql-general

Thank you all for your input.

We have solved the problem by -

   1. Configuring wal2json to include xids
   <https://github.com/eulerto/wal2json/blob/master/README.md;.
   2. Updating our enrichment queries to return the xmin
   <https://www.postgresql.org/docs/9.1/ddl-system-columns.html;.
   3. Retrying our enrichment queries after a brief sleep if xmin < xid.

This seems to be working very well so far.

Daniel McKenzie
Software Developer

Office: +1 403.910.5927 x 251
Mobile: +44 7712 159045
Website: www.curvedental.com

*Curve Dental Confidentiality Notice*
This message is intended exclusively for the individual or entity to which
it is addressed. This communication may contain information that is
proprietary, privileged, confidential, or otherwise legally exempt from
disclosure. If you are not the named addressee, you are not authorized to
read, print, retain, copy, or disseminate this message or any part of it.
If you have received this message in error, please notify the sender
immediately by replying to this e-mail and delete all copies of this
message.

On Thu, May 9, 2024 at 5:45 PM Adrian Klaver <[email protected]>
wrote:

> On 5/9/24 00:32, Daniel McKenzie wrote:
>
> > We've had this running in live now for years without a hiccup so we are
> > surprised to learn that we have this massive race condition and it just
> > so happens that the hardware is fast enough to process the transaction
> > before the .NET application can react to replication slot changes.
>
> On broad scale I'm going to say that over 'for years' there has been an
> increase in load on the Postgres server as well as the I/0 system of the
> machine it is running on. What you are seeing now is the canary in the
> mine giving you the heads up that more trouble lies ahead as the
> hardware and software is reaching load limits.
>
> On finer scale my guess is that the following is happening when
> synchronous_commit = on:
>
> 1) Postgres session #1 does data change.
>
> 2) This is replicated out and picked up by wal2json, which sees the new
> data.
>
> 3) The Postgres server waits for the confirmation that the WAL record
> has been written out to disk. Upon confirmation it commits on the
> server. This is the part that I am not sure of in relation to wal2json.
>
> 4) Postgres session #2 queries the database for the record. In the case
> where 3) has not completed it sees the old values as the data change in
> session #1 has not committed and therefore the new values are not seen
> by other sessions.
>
> >
> > Daniel McKenzie
> > Software Developer
>
> --
> Adrian Klaver
> [email protected]
>
>

^ permalink  raw  reply  [nested|flat] 7+ messages in thread

end of thread, other threads:[~2024-05-10 12:11 UTC | newest]

Thread overview: 7+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-05-08 16:28 Re: Unexpected data when subscribing to logical replication slot Adrian Klaver <[email protected]>
2024-05-09 07:32 ` Daniel McKenzie <[email protected]>
2024-05-09 15:07   ` Adrian Klaver <[email protected]>
2024-05-09 15:14   ` Torsten Förtsch <[email protected]>
2024-05-09 15:16     ` Torsten Förtsch <[email protected]>
2024-05-09 16:45   ` Adrian Klaver <[email protected]>
2024-05-10 12:11     ` Daniel McKenzie <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox