public inbox for [email protected]help / color / mirror / Atom feed
On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null? 4+ messages / 2 participants [nested] [flat]
* On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null? @ 2024-08-23 14:33 Koen De Groote <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: Koen De Groote @ 2024-08-23 14:33 UTC (permalink / raw) To: PostgreSQL General <[email protected]> I'm running this query: SELECT subname, received_lsn, latest_end_lsn, last_msg_send_time, last_msg_receipt_time FROM pg_catalog.pg_stat_subscription; And "last_msg_send_time" will sometimes be null. All I can find in the documentation is that this value represents: "Send time of last message received from origin WAL sender; NULL for parallel apply workers" But I have no context for what parallel apply workers are or why they are NULL sometimes and other times not. Can anyone explain? Regards, Koen De Groote ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null? @ 2024-08-23 15:02 Adrian Klaver <[email protected]> parent: Koen De Groote <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: Adrian Klaver @ 2024-08-23 15:02 UTC (permalink / raw) To: Koen De Groote <[email protected]>; PostgreSQL General <[email protected]> On 8/23/24 07:33, Koen De Groote wrote: > I'm running this query: > > SELECT subname, received_lsn, latest_end_lsn, > last_msg_send_time, last_msg_receipt_time > FROM pg_catalog.pg_stat_subscription; > > And "last_msg_send_time" will sometimes be null. All I can find in the > documentation is that this value represents: "Send time of last message > received from origin WAL sender; NULL for parallel apply workers" > > But I have no context for what parallel apply workers are or why they > are NULL sometimes and other times not. > > Can anyone explain? The source has that information: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/applypa... The top part explains the process. As to the time: /* * There are three fields in each message received by the parallel apply * worker: start_lsn, end_lsn and send_time. Because we have updated these * statistics in the leader apply worker, we can ignore these fields in the * parallel apply worker (see function LogicalRepApplyLoop). */ > > Regards, > Koen De Groote -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null? @ 2024-08-23 19:31 Koen De Groote <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 4+ messages in thread From: Koen De Groote @ 2024-08-23 19:31 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]> Adrian, Thanks for pointing me at the source code. Digging a bit, the view seems to eventually get its data on last msg send time from here: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/launche... And in particular, this: 1332 if (worker.last_send_time == 0) 1333 nulls[5] = true; 1334 else 1335 values[5] = TimestampTzGetDatum(worker.last_send_time); 1336 if (worker.last_recv_time == 0) 1337 nulls[6] = true; 1338 else 1339 values[6] = TimestampTzGetDatum(worker.last_recv_time); I don't have any knowledge of C, or postgres internals, so I may well be wrong in what follows: From the bit of comment you posted my impression is that this means there are separate workers that each send their update, that is then reflected in the output of the pg_stat_subscription table... many workers, but only 1 table to show metrics, to show both the update by the leader and the parallel workers... And these parallel workers get created on the fly, I assume? So they might well have a last_send_time of 0 if they haven't done anything yet? What I would expect to see is a table that tells me how a particular publishers/subscriber is doing, and the metrics around that process, and the concept of "when data was last sent" to be persistent on the level of the publisher/subscriber, not based on the lifespan of ephemeral workers that each time they get created start at 0 and so now the table claims "last_send_msg" is NULL because of it. Am I getting that wrong? Is my understanding mistaken? Regards, Koen De Groote On Fri, Aug 23, 2024 at 5:02 PM Adrian Klaver <[email protected]> wrote: > On 8/23/24 07:33, Koen De Groote wrote: > > I'm running this query: > > > > SELECT subname, received_lsn, latest_end_lsn, > > last_msg_send_time, last_msg_receipt_time > > FROM pg_catalog.pg_stat_subscription; > > > > And "last_msg_send_time" will sometimes be null. All I can find in the > > documentation is that this value represents: "Send time of last message > > received from origin WAL sender; NULL for parallel apply workers" > > > > But I have no context for what parallel apply workers are or why they > > are NULL sometimes and other times not. > > > Can anyone explain? > > The source has that information: > > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/applypa... > > The top part explains the process. > > As to the time: > > /* > * There are three fields in each message received by the parallel apply > * worker: start_lsn, end_lsn and send_time. Because we have updated these > * statistics in the leader apply worker, we can ignore these fields in > the > * parallel apply worker (see function LogicalRepApplyLoop). > */ > > > > > Regards, > > Koen De Groote > > -- > Adrian Klaver > [email protected] > > ^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null? @ 2024-08-23 20:14 Adrian Klaver <[email protected]> parent: Koen De Groote <[email protected]> 0 siblings, 0 replies; 4+ messages in thread From: Adrian Klaver @ 2024-08-23 20:14 UTC (permalink / raw) To: Koen De Groote <[email protected]>; +Cc: PostgreSQL General <[email protected]> On 8/23/24 12:31, Koen De Groote wrote: > Adrian, > > Thanks for pointing me at the source code. > > Digging a bit, the view seems to eventually get its data on last msg > send time from here: > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/launche... <https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/launche...; > > And in particular, this: > > 1332 if (worker.last_send_time == 0) > 1333 nulls[5] = true; > 1334 else > 1335 values[5] = TimestampTzGetDatum(worker.last_send_time); > 1336 if (worker.last_recv_time == 0) > 1337 nulls[6] = true; > 1338 else > 1339 values[6] = TimestampTzGetDatum(worker.last_recv_time); > > I don't have any knowledge of C, or postgres internals, so I may well be > wrong in what follows: > > From the bit of comment you posted my impression is that this means > there are separate workers that each send their update, that is then > reflected in the output of the pg_stat_subscription table... many > workers, but only 1 table to show metrics, to show both the update by > the leader and the parallel workers... This is getting out of my depth, but that has not stopped me before so onward. I think it is important to realize parallel workers are optional: https://www.postgresql.org/docs/current/sql-createsubscription.html streaming (enum) Specifies whether to enable streaming of in-progress transactions for this subscription. The default value is off, meaning all transactions are fully decoded on the publisher and only then sent to the subscriber as a whole. If set to on, the incoming changes are written to temporary files and then applied only after the transaction is committed on the publisher and received by the subscriber. If set to parallel, incoming changes are directly applied via one of the parallel apply workers, if available. If no parallel apply worker is free to handle streaming transactions then the changes are written to temporary files and applied after the transaction is committed. Note that if an error happens in a parallel apply worker, the finish LSN of the remote transaction might not be reported in the server log. > > And these parallel workers get created on the fly, I assume? So they > might well have a last_send_time of 0 if they haven't done anything yet? > > What I would expect to see is a table that tells me how a particular > publishers/subscriber is doing, and the metrics around that process, and > the concept of "when data was last sent" to be persistent on the level > of the publisher/subscriber, not based on the lifespan of ephemeral > workers that each time they get created start at 0 and so now the table > claims "last_send_msg" is NULL because of it. Some combination of?: https://www.postgresql.org/docs/current/view-pg-replication-slots.html https://www.postgresql.org/docs/current/catalog-pg-subscription-rel.html > > Am I getting that wrong? Is my understanding mistaken? > > Regards, > Koen De Groote > > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2024-08-23 20:14 UTC | newest] Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2024-08-23 14:33 On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null? Koen De Groote <[email protected]> 2024-08-23 15:02 ` Adrian Klaver <[email protected]> 2024-08-23 19:31 ` Koen De Groote <[email protected]> 2024-08-23 20:14 ` Adrian Klaver <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox