MIME-Version: 1.0
References: <2631a3c3-5e60-4a1a-9e20-377024322602@gmail.com>
From: Anthonin Bonnefoy
Date: Tue, 17 Mar 2026 17:51:40 +0100
Subject: Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record
To: Michael Paquier
Cc: Fujii Masao, Andres Freund, Alexander Lakhin, PostgreSQL Hackers
Content-Type: multipart/mixed; boundary="00000000000004d7fd064d3b27c0"

--00000000000004d7fd064d3b27c0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Tue, Mar 17, 2026 at 12:26 AM Michael Paquier wrote:
> This stuff seems sensible enough that I think we should at least have
> a test, no? It does not have to be absolutely perfect in terms of
> reproducibility, just good enough to be able to detect it across the
> buildfarm. We already do various things with page boundaries in WAL
> during recovery, and a shutdown could be perhaps timed to increase the
> reproducibility rate of the issues discussed?

I initially thought that there was no easy way to trigger this issue
reliably in a test: the script I've been using won't work as soon as
there are changes in the record sizes. Then I remembered that
pg_logical_emit_message existed and could be used to write a WAL
record of a specific size, without allocating a xid and without
flushing the record.

With this, the test can be simplified to:

    SELECT pg_switch_wal();
    BEGIN;
    SELECT pg_logical_emit_message(false, '', repeat('a', 16265), false);
    ROLLBACK;

Any change to the WAL short page header, the long page header, or the
xl_logical_message struct will "break" the test, since the record will
no longer end exactly at the page boundary. The test also assumes
8-byte alignment. On 32-bit machines, the WAL record will end at 3FF0,
so not exactly at the end of the page, but that should be fine since it
exercises different conditions.

A word of caution about this test: while running it on my machine, I
managed to trigger some weird WAL corruption. The new segment after
the switch had 1 or 2 extra bytes at the start of the segment, just
before the xlog page magic, shifting the whole file. The first time it
happened, I thought I had messed something up and added the bytes
myself while looking at the WAL with imhex. The second time, I had only
run the script, and the new segment had a 1.1MB size shortly after, so
I'm pretty sure I didn't do anything that could have introduced those
extra bytes.

I'm still trying to understand the trigger conditions (some race
condition between the switch and the walwriter?), but if this test is
merged, it may trigger this WAL corruption issue on the buildfarm.

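For reference, the 16265 figure follows from the sizes listed in the
comment of the attached patch: a logical message record takes 55 bytes
plus the message length, the first page of a fresh segment leaves 8152
bytes after its long header, and the following page leaves 8168 bytes
after its short header. Below is a minimal sketch of that arithmetic,
assuming the default 8192-byte WAL page size and 8-byte alignment. The
constant and function names are hypothetical, chosen for illustration
only, and are not PostgreSQL identifiers. The sketch does not model the
32-bit case mentioned above.

```python
# Sizes come from the comment in the attached patch. They assume the
# default 8192-byte WAL page size and 8-byte alignment (assumptions).
WAL_PAGE_SIZE = 8192
LONG_HEADER_PAGE_FREE = 8152   # usable bytes in the first page of a segment
SHORT_HEADER_PAGE_FREE = 8168  # usable bytes in each following page
RECORD_OVERHEAD = 55           # logical message record size minus the message


def message_length_to_fill(pages: int) -> int:
    """Message length for a record that starts at the first usable byte of
    a fresh segment and ends exactly at the end of page number `pages`."""
    if pages < 1:
        raise ValueError("pages must be >= 1")
    usable = LONG_HEADER_PAGE_FREE + (pages - 1) * SHORT_HEADER_PAGE_FREE
    return usable - RECORD_OVERHEAD


def record_end_offset(message_length: int) -> int:
    """Offset within the segment at which such a record ends."""
    remaining = RECORD_OVERHEAD + message_length
    page = 0
    free = LONG_HEADER_PAGE_FREE
    # Consume whole pages while the record does not fit in the current one.
    while remaining > free:
        remaining -= free
        page += 1
        free = SHORT_HEADER_PAGE_FREE
    header = WAL_PAGE_SIZE - free
    return page * WAL_PAGE_SIZE + header + remaining


print(message_length_to_fill(2))         # 16265
print(hex(record_end_offset(16265)))     # 0x4000
```

If any of the header or struct sizes change, only the constants at the
top need updating to find the new message length.
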
Regards, Anthonin Bonnefoy --00000000000004d7fd064d3b27c0 Content-Type: application/octet-stream; name="v1-0001-Add-test-shutting-down-walsender-with-unflushed-r.patch" Content-Disposition: attachment; filename="v1-0001-Add-test-shutting-down-walsender-with-unflushed-r.patch" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_mmuujg9l0 RnJvbSA5N2I5YmE0Mzc0YjU0MjNmNDBiMzg2OGQ4M2I1MTVkOTQ5NjlkZmE2IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBBbnRob25pbiBCb25uZWZveSA8YW50aG9uaW4uYm9ubmVmb3lA ZGF0YWRvZ2hxLmNvbT4KRGF0ZTogVHVlLCAxNyBNYXIgMjAyNiAxNTowNzo1NSArMDEwMApTdWJq ZWN0OiBBZGQgdGVzdCBzaHV0dGluZyBkb3duIHdhbHNlbmRlciB3aXRoIHVuZmx1c2hlZCByZWNv cmQKCjZlZWRiMmE1ZmQ4IGZpeGVkIGFuIGlzc3VlIHdoZXJlIHRoZSB3YWxzZW5kZXIgd2FzIHN0 dWNrIGluIGEgYnVzeSBsb29wLAp0cnlpbmcgdG8gcmVhZCBhbiB1bmZsdXNlZCByZWNvcmQuCgpk OTI3YjRiZDk3IGZpeGVkIGFub3RoZXIgaXNzdWUgaW50cm9kdWNlZCBieSB0aGUgcHJldmlvdXMg Y29tbWl0LCB3aGVyZQpYTG9nRmx1c2ggd291bGQgYmUgY2FsbGVkIHBhc3QgdGhlIGVuZCBvZiB0 aGUgZ2VuZXJhdGVkIFdBTCwgZ2VuZXJhdGluZwphbiBlcnJvciBsb2cuCgpUaGlzIGNvbW1pdCBh ZGRzIGEgdGVzdCB0byBjb3ZlciB0aG9zZSB0d28gZml4ZXMuIEJ5IHdyaXRpbmcgYSBsb2dpY2Fs Cm1lc3NhZ2Ugb2YgYSBzcGVjaWZpYyBzaXplLCB3ZSBjYW4gcmVhY2ggYSBzdGF0ZSB3aGVyZSB0 aGUgbGFzdCByZWNvcmQKaXMgY3Jvc3NpbmcgdGhlIHBhZ2UgYm91bmRhcnksIGFuZCBlbmRzIGF0 IHRoZSBlbmQgb2YgdGhlIG5leHQgcGFnZSwKcmVjcmVhdGluZyB0aGUgY29uZGl0aW9ucyBmb3Ig dGhlIGFmb3JlbWVudGlvbmVkIGlzc3Vlcy4KLS0tCiBzcmMvdGVzdC9yZWNvdmVyeS90LzAwNl9s b2dpY2FsX2RlY29kaW5nLnBsIHwgNTAgKysrKysrKysrKysrKysrKysrKystCiAxIGZpbGUgY2hh bmdlZCwgNDkgaW5zZXJ0aW9ucygrKSwgMSBkZWxldGlvbigtKQoKZGlmZiAtLWdpdCBhL3NyYy90 ZXN0L3JlY292ZXJ5L3QvMDA2X2xvZ2ljYWxfZGVjb2RpbmcucGwgYi9zcmMvdGVzdC9yZWNvdmVy eS90LzAwNl9sb2dpY2FsX2RlY29kaW5nLnBsCmluZGV4IDk3ZDExZjk4YjU5Li4xZWZiNmY3YjUy NyAxMDA2NDQKLS0tIGEvc3JjL3Rlc3QvcmVjb3ZlcnkvdC8wMDZfbG9naWNhbF9kZWNvZGluZy5w bAorKysgYi9zcmMvdGVzdC9yZWNvdmVyeS90LzAwNl9sb2dpY2FsX2RlY29kaW5nLnBsCkBAIC0y NzUsNyArMjc1LDU1IEBAIGlzKCAkbm9kZV9wcmltYXJ5LT5zYWZlX3BzcWwoCiAJcXEoQ2hlY2sg 
dGhhdCByZXNldCB0aW1lc3RhbXAgaXMgbGF0ZXIgYWZ0ZXIgcmVzZXR0aW5nIHN0YXRzIGZvciBz bG90ICckc3RhdHNfdGVzdF9zbG90MScgYWdhaW4uKQogKTsKIAotIyBkb25lIHdpdGggdGhlIG5v ZGUKK1NLSVA6Cit7CisKKwkjIHNvbWUgV2luZG93cyBQZXJscyBhdCBsZWFzdCBkb24ndCBsaWtl IElQQzo6UnVuJ3Mgc3RhcnQva2lsbF9raWxsIHJlZ2ltZS4KKwlza2lwICJUZXN0IGZhaWxzIG9u IFdpbmRvd3MgcGVybCIsIDIgaWYgJENvbmZpZ3tvc25hbWV9IGVxICdNU1dpbjMyJzsKKworCSMg VGVzdCBzdG9wcGluZyB0aGUgcHJpbWFyeSB3aXRoIGFuIGFjdGl2ZSB3YWxzZW5kZXIgYW5kIGFu IHVuZmx1c2hlZCByZWNvcmQKKwkjIHdoaWNoIGNyb3NzZXMgdGhlIHBhZ2UgYm91bmRhcnkgYW5k IGVuZHMgYXQgdGhlIGVuZCBvZiB0aGUgbmV4dCBwYWdlLgorCSMKKwkjIEZpcnN0LCBzdGFydCBw Z19yZWN2bG9naWNhbCB0byBoYXZlIGFuIGFjdGl2ZSB3YWxzZW5kZXIKKwlteSAkcGdfcmVjdmxv Z2ljYWwgPSBJUEM6OlJ1bjo6c3RhcnQoCisJCVsKKwkJCSdwZ19yZWN2bG9naWNhbCcsCisJCQkn LS1kYm5hbWUnID0+ICRub2RlX3ByaW1hcnktPmNvbm5zdHIoJ3Bvc3RncmVzJyksCisJCQknLS1z bG90JyA9PiAndGVzdF9zbG90JywKKwkJCSctLWZpbGUnID0+ICctJywKKwkJCSctLXN0YXJ0Jwor CQldKTsKKworCSMgVGhlbiwgd2Ugd3JpdGUgYSBsb2dpY2FsIG1lc3NhZ2UgV0FMIHJlY29yZCB3 aGljaCBmaW5pc2hlcyBhdCB0aGUgZW5kIG9mIGEKKwkjIFdBTCBwYWdlLCB1c2luZyBhIHJvbGxi YWNrIHNvIHRoZSBXQUwgcmVjb3JkIGlzbid0IGZsdXNoZWQuCisJIworCSMgVGhlIHNpemUgb2Yg YSBXQUwgbG9naWNhbCBtZXNzYWdlIHJlY29yZCBpcyA1NSBieXRlcyArIG1lc3NhZ2UgbGVuZ3Ro CisJIyBTdGFydGluZyBmcm9tIGEgZnJlc2ggV0FMIHNlZ21lbnQsIHdlIGhhdmU6CisJIyAgIC0g ODE1MiBieXRlcyBhdmFpbGFibGUgaW4gdGhlIGZpcnN0IHBhZ2UgKGxvbmcgaGVhZGVyKQorCSMg ICAtIDgxNjggYnl0ZXMgYXZhaWxhYmxlIGluIHRoZSBzZWNvbmQgcGFnZSAoc2hvcnQgaGVhZGVy KQorCSMgV2UgbmVlZCB0byB3cml0ZSAxNjMyMCBieXRlcyBvZiBsb2dpY2FsIG1lc3NhZ2UgV0FM IHJlY29yZCwgd2hpY2ggY2FuIGJlIGRvbmUKKwkjIHVzaW5nIGEgMTYyNjUgYnl0ZXMgbG9uZyBt ZXNzYWdlLgorCSRub2RlX3ByaW1hcnktPnNhZmVfcHNxbCgncG9zdGdyZXMnLAorCQlxcVsKKwkJ U0VMRUNUIHBnX3N3aXRjaF93YWwoKTsKKwkJQkVHSU47CisJCVNFTEVDVCBwZ19sb2dpY2FsX2Vt aXRfbWVzc2FnZShmYWxzZSwgJycsIHJlcGVhdCgnYScsIDE2MjY1KSwgZmFsc2UpOworCQlST0xM QkFDSzsKKwkJXQorCSk7CisKKwkjIHRyeSB0byByZXN0YXJ0CisJJG5vZGVfcHJpbWFyeS0+cmVz 
dGFydDsKKwkkcGdfcmVjdmxvZ2ljYWwtPmtpbGxfa2lsbDsKKworCW15ICRsb2dmaWxlID0gc2x1 cnBfZmlsZSgkbm9kZV9wcmltYXJ5LT5sb2dmaWxlKCkpOworCXVubGlrZSgKKwkJJGxvZ2ZpbGUs CisJCXFyL3JlcXVlc3QgdG8gZmx1c2ggcGFzdCBlbmQgb2YgZ2VuZXJhdGVkIFdBTC8sCisJCSJU aGVyZSdzIG5vIGZsdXNoIHJlcXVlc3QgcGFzdCBlbmQgb2YgZ2VuZXJhdGVkIFdBTCIpOworfQor CisjIHN0b3AgdGhlIG5vZGUKICRub2RlX3ByaW1hcnktPnN0b3A7CiAKIGRvbmVfdGVzdGluZygp OwotLSAKMi41My4wCgo= --00000000000004d7fd064d3b27c0--