public inbox for [email protected]
help / color / mirror / Atom feedFrom: SATYANARAYANA NARLAPURAM <[email protected]>
To: PostgreSQL Hackers <[email protected]>
To: Andres Freund <[email protected]>
To: Thomas Munro <[email protected]>
Subject: Fix checkpointer PANIC due to missing fsync cancel in mdunlinkfork()
Date: Mon, 13 Apr 2026 18:55:28 -0700
Message-ID: <CAHg+QDdH+uBc0deP7uPE5rBkn0EjdkVC2_8ykK3KV90V4z6_5Q@mail.gmail.com> (raw)
Hi hackers,
I think I found a bug in mdunlinkfork() that causes the checkpointer
to PANIC with "could not fsync file ... No such file or directory" when
a relation is moved between tablespaces and the old tablespace is dropped.
mdunlinkfork() has two code paths for unlinking a relation segment file:
(1) The 'if' branch (isRedo, binary upgrade, non-MAIN forknum, or temp
relations) correctly calls register_forget_request() to cancel any pending
fsync before unlinking. (2) The 'else' branch that handles normal
MAIN_FORKNUM unlink is missing register_forget_request().
This leaves a stale SYNC_REQUEST entry in the checkpointer's
pendingOps hash table.
The comment in ProcessSyncRequests() states below:
/*
* The fsync table could contain requests to fsync segments that
* have been deleted (unlinked) by the time we get to them. Rather
* than just hoping an ENOENT (or EACCES on Windows) error can be
* ignored, what we do on error is absorb pending requests and
* then retry. Since mdunlink() queues a "cancel" message before
* actually unlinking, the fsync request is guaranteed to be
* marked canceled after the absorb if it really was this case.
* DROP DATABASE likewise has to tell us to forget fsync requests
* before it starts deletions.
*/
but this guarantee is not provided by the else branch.
Repro:
Reproducing this is not easy as it is a race but the test below
when repeated sufficient number of times you can see ocasionally
checkpointer PANIC. Though I used TABLESPACE for somewhat
consistent repro, this race exists for operations like DROP table,
TRUNCATE table as well I believe.
CREATE TABLESPACE ts LOCATION '/tmp/ts';
CREATE TABLE t (id int, pad text) TABLESPACE ts;
INSERT INTO t SELECT g, repeat('x', 500) FROM generate_series(1,50000) g;
-- Concurrent updates to dirty shared buffers (background sessions)
UPDATE t SET pad = repeat('y', 500) WHERE id <= 25000;
-- Move table away; FlushRelationBuffers sends SYNC_REQUESTs for old path
ALTER TABLE t SET TABLESPACE pg_default;
-- Drop tablespace forces checkpoint + removes directory
DROP TABLESPACE ts;
-- Next checkpoint hits stale entry -> PANIC
CHECKPOINT;
PANIC: could not fsync file "pg_tblspc/41343/PG_19_202604061/5/41348": No
such file or directory
LOG: checkpointer process (PID 166749) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing
The fix is to add register_forget_request() in the else branch of
mdunlinkfork(), before register_unlink_segment(), matching what the
'if' branch already does.
Thoughts?
Thanks,
Satya
Attachments:
[application/octet-stream] 0001-Fix-checkpointer-PANIC-due-to-missing-fsync-cancel-i.patch (1.2K, 3-0001-Fix-checkpointer-PANIC-due-to-missing-fsync-cancel-i.patch)
download | inline diff:
From 48089e678b425291442f4dc14d87ca25c524f65f Mon Sep 17 00:00:00 2001
From: Satyanarayana Narlapuram <[email protected]>
Date: Tue, 14 Apr 2026 00:09:44 +0000
Subject: [PATCH] Fix checkpointer PANIC due to missing fsync cancel in
mdunlinkfork()
Add missing register_forget_request() in the 'else' branch of
mdunlinkfork(), before register_unlink_segment(), matching what the
'if' branch already does.
---
src/backend/storage/smgr/md.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index dee2903..d715307 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -419,6 +419,11 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path.str);
+ /* Forget any pending sync requests for the first segment */
+ save_errno = errno;
+ register_forget_request(rlocator, forknum, 0 /* first seg */ );
+ errno = save_errno;
+
/* Register request to unlink first segment later */
save_errno = errno;
register_unlink_segment(rlocator, forknum, 0 /* first seg */ );
--
2.43.0
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: Fix checkpointer PANIC due to missing fsync cancel in mdunlinkfork()
In-Reply-To: <CAHg+QDdH+uBc0deP7uPE5rBkn0EjdkVC2_8ykK3KV90V4z6_5Q@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox