pgsql: Fix rare assertion failure in standby, if primary is restarted
* pgsql: Fix rare assertion failure in standby, if primary is restarted
From: Heikki Linnakangas <[email protected]> @ 2025-03-23 18:45 UTC
To: [email protected]

Fix rare assertion failure in standby, if primary is restarted

During hot standby, the ExpireAllKnownAssignedTransactionIds() and ExpireOldKnownAssignedTransactionIds() functions mark old transactions as no longer running, but they failed to update xactCompletionCount and latestCompletedXid. AFAICS it would not lead to incorrect query results, because those functions effectively turn in-progress transactions into aborted transactions and an MVCC snapshot considers both as "not visible". But it could surprise GetSnapshotDataReuse() and trigger the "TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)" assertion in it, if the apparent xmin in a backend moved backwards. We saw this happen when GetCatalogSnapshot() reused an older catalog snapshot after GetTransactionSnapshot() had already advanced TransactionXmin.

The bug goes back all the way to commit 623a9ba79b in v14, which introduced the snapshot reuse mechanism, but it started to happen more frequently with commit 952365cded6, which removed a GetTransactionSnapshot() call from backend startup. That made it more likely for ExpireOldKnownAssignedTransactionIds() to be called between GetCatalogSnapshot() and the first GetTransactionSnapshot() in a backend.

Andres Freund first spotted this assertion failure on buildfarm member 'skink'. Reproduction and analysis by Tomas Vondra.

Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/oey246mcw43cy4qw2hqjmurbd62lfdpcuxyqiu7botx3typpax%40h7o7mfg5z...

Branch
------
REL_15_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/b30c77a0e480352cce573195af819bd41a3c1b42

Modified Files
--------------
src/backend/storage/ipc/procarray.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2817525f0d56075e1f3a14c0dc6a180b337d8aed

Modified Files
--------------
src/backend/storage/ipc/procarray.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

Branch
------
REL_17_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/302ce5bd93b48549bf6717512ea92252319dc944

Modified Files
--------------
src/backend/storage/ipc/procarray.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/66235baab72b22456c88a28db788048b52712100

Modified Files
--------------
src/backend/storage/ipc/procarray.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

Branch
------
REL_16_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/2f33de3cdbc814bcc4270aafd98880a12d265777

Modified Files
--------------
src/backend/storage/ipc/procarray.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
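
The failure mode described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from procarray.c: the types ToyShared and ToySnapshot and the functions toy_get_snapshot_xmin(), toy_expire_all() and toy_invariant_holds() are hypothetical names invented for illustration, XIDs are compared as plain integers (ignoring wraparound), and the "fix" is modelled simply as advancing the two bookkeeping fields that the message says were not being updated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;

#define TOY_MAX_XIDS 8

/* Hypothetical stand-in for the standby's shared transaction state. */
typedef struct {
    Xid      known_assigned[TOY_MAX_XIDS]; /* XIDs still treated as in progress */
    int      n_known;
    Xid      latest_completed;   /* plays the role of latestCompletedXid */
    uint64_t completion_count;   /* plays the role of xactCompletionCount */
} ToyShared;

/* A cached snapshot remembers the counter value it was built under. */
typedef struct {
    bool     valid;
    Xid      xmin;
    uint64_t built_at_count;
} ToySnapshot;

/* Build the snapshot, or reuse it if no completion was recorded since. */
Xid toy_get_snapshot_xmin(const ToyShared *sh, ToySnapshot *snap)
{
    assert(sh->n_known <= TOY_MAX_XIDS);

    if (snap->valid && snap->built_at_count == sh->completion_count)
        return snap->xmin;                  /* reuse path: nothing recomputed */

    /* Rebuild: xmin is the oldest in-progress XID, else latest completed + 1. */
    Xid xmin = sh->latest_completed + 1;
    for (int i = 0; i < sh->n_known; i++)
        if (sh->known_assigned[i] < xmin)
            xmin = sh->known_assigned[i];

    snap->valid = true;
    snap->xmin = xmin;
    snap->built_at_count = sh->completion_count;
    return xmin;
}

/*
 * Forget every known-assigned XID, as when the primary has restarted.
 * update_bookkeeping = false models the behaviour before the fix;
 * true models also advancing the completion counter and latest XID.
 */
void toy_expire_all(ToyShared *sh, bool update_bookkeeping)
{
    for (int i = 0; i < sh->n_known; i++)
        if (update_bookkeeping && sh->known_assigned[i] > sh->latest_completed)
            sh->latest_completed = sh->known_assigned[i];
    sh->n_known = 0;
    if (update_bookkeeping)
        sh->completion_count++;
}

/* Returns true if "TransactionXmin <= RecentXmin" still holds at the end. */
bool toy_invariant_holds(bool update_bookkeeping)
{
    /* XID 100 is still in progress although 101..105 have completed. */
    ToyShared sh = { .known_assigned = {100}, .n_known = 1,
                     .latest_completed = 105, .completion_count = 0 };
    ToySnapshot catalog = {0}, xact = {0};

    toy_get_snapshot_xmin(&sh, &catalog);       /* catalog snapshot, xmin 100 */
    toy_expire_all(&sh, update_bookkeeping);    /* XID 100 is expired */

    /* First transaction snapshot is built fresh and advances the xmin. */
    Xid transaction_xmin = toy_get_snapshot_xmin(&sh, &xact);

    /* Catalog snapshot is requested again: reused or rebuilt? */
    Xid recent_xmin = toy_get_snapshot_xmin(&sh, &catalog);

    return transaction_xmin <= recent_xmin;
}
```

In this model, toy_invariant_holds(false) returns false because the stale catalog snapshot is reused and its xmin lies behind the one the transaction snapshot established, while toy_invariant_holds(true) returns true because the bumped counter forces a rebuild. The real assertion uses wraparound-aware XID comparison, which the sketch deliberately leaves out.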