From: SATYANARAYANA NARLAPURAM
Date: Mon, 13 Apr 2026 18:55:28 -0700
Subject: Fix checkpointer PANIC due to missing fsync cancel in mdunlinkfork()
To: PostgreSQL Hackers, Andres Freund, Thomas Munro

Hi hackers,

I think I found a bug in mdunlinkfork() that causes the checkpointer
to PANIC with "could not fsync file ... No such file or directory" when
a relation is moved between tablespaces and the old tablespace is dropped.

mdunlinkfork() has two code paths for unlinking a relation segment file:
(1) The 'if' branch (isRedo, binary upgrade, non-MAIN forknum, or temp
relations) correctly calls register_forget_request() to cancel any pending
fsync before unlinking.
(2) The 'else' branch, which handles the normal MAIN_FORKNUM unlink, is
missing register_forget_request(). This leaves a stale SYNC_REQUEST entry
in the checkpointer's pendingOps hash table.
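
To make the difference between the two branches concrete, here is a minimal toy model in C. It is only a sketch: ToyPendingOp, toy_register_sync(), toy_register_forget(), toy_live_requests() and toy_unlink() are hypothetical stand-ins invented for illustration, not PostgreSQL's real structures or functions, and a fixed-size array replaces the hash table.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_OPS 8

/* Toy stand-in for one pendingOps entry; not PostgreSQL's real struct. */
typedef struct
{
	char		path[64];
	bool		used;
	bool		canceled;
} ToyPendingOp;

static ToyPendingOp toy_ops[TOY_MAX_OPS];

/* Queue an fsync request for a path (hypothetical helper). */
static void
toy_register_sync(const char *path)
{
	for (int i = 0; i < TOY_MAX_OPS; i++)
	{
		if (!toy_ops[i].used)
		{
			snprintf(toy_ops[i].path, sizeof(toy_ops[i].path), "%s", path);
			toy_ops[i].used = true;
			toy_ops[i].canceled = false;
			return;
		}
	}
}

/* Mark every queued request for a path as canceled (hypothetical helper). */
static void
toy_register_forget(const char *path)
{
	for (int i = 0; i < TOY_MAX_OPS; i++)
	{
		if (toy_ops[i].used && strcmp(toy_ops[i].path, path) == 0)
			toy_ops[i].canceled = true;
	}
}

/* Count the requests for a path that would still be fsync'ed. */
static int
toy_live_requests(const char *path)
{
	int			n = 0;

	for (int i = 0; i < TOY_MAX_OPS; i++)
	{
		if (toy_ops[i].used && !toy_ops[i].canceled &&
			strcmp(toy_ops[i].path, path) == 0)
			n++;
	}
	return n;
}

/*
 * Model of the two unlink paths. forget_first = true mimics the 'if'
 * branch; forget_first = false mimics the 'else' branch as described above.
 */
static void
toy_unlink(const char *path, bool forget_first)
{
	if (forget_first)
		toy_register_forget(path);
	/* The file itself would be truncated or unlinked at this point. */
}
```

In this model, a path unlinked with forget_first = false keeps a live request that points at a file which no longer exists, which is the stale entry the checkpointer later trips over.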

The comment in ProcessSyncRequests() states:

/*
 * The fsync table could contain requests to fsync segments that
 * have been deleted (unlinked) by the time we get to them. Rather
 * than just hoping an ENOENT (or EACCES on Windows) error can be
 * ignored, what we do on error is absorb pending requests and
 * then retry. Since mdunlink() queues a "cancel" message before
 * actually unlinking, the fsync request is guaranteed to be
 * marked canceled after the absorb if it really was this case.
 * DROP DATABASE likewise has to tell us to forget fsync requests
 * before it starts deletions.
 */

but this guarantee is not provided by the else branch.
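
The absorb-and-retry behavior that the comment describes can be sketched roughly as follows. This is a simplified model written from the comment's wording, not the actual ProcessSyncRequests() code: ToyEntry, ToyOutcome, toy_fsync(), toy_absorb() and toy_process_entry() are hypothetical names, and the PANIC is represented by a return value.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy state for one pending fsync request; all fields are illustrative. */
typedef struct
{
	bool		canceled;		/* a "forget" message has been absorbed */
	bool		forget_queued;	/* a "forget" message is waiting in the queue */
	bool		file_exists;	/* the segment file is still on disk */
} ToyEntry;

typedef enum
{
	TOY_SYNCED,
	TOY_SKIPPED,
	TOY_PANIC
} ToyOutcome;

/* Stand-in for fsync(): fails with ENOENT once the file is gone. */
static int
toy_fsync(const ToyEntry *e)
{
	if (!e->file_exists)
	{
		errno = ENOENT;
		return -1;
	}
	return 0;
}

/* Stand-in for absorbing queued requests, including "forget" messages. */
static void
toy_absorb(ToyEntry *e)
{
	if (e->forget_queued)
	{
		e->canceled = true;
		e->forget_queued = false;
	}
}

/*
 * On the first ENOENT, absorb pending requests and retry. If the unlinker
 * queued a "forget" first, the entry is now canceled and gets skipped.
 * Without that message the retry fails again and the error is fatal.
 */
static ToyOutcome
toy_process_entry(ToyEntry *e)
{
	for (int failures = 0; !e->canceled; failures++)
	{
		if (toy_fsync(e) == 0)
			return TOY_SYNCED;
		if (errno != ENOENT || failures > 0)
			return TOY_PANIC;
		toy_absorb(e);
	}
	return TOY_SKIPPED;
}
```

The case of interest is an entry whose file is gone and for which no "forget" message was ever queued: the absorb step has nothing to pick up, so the retry fails and the model reports TOY_PANIC.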

Repro:
Reproducing this is not easy because it is a race, but if the test below
is repeated a sufficient number of times, the checkpointer occasionally
PANICs. Although I used a tablespace for a somewhat consistent repro,
I believe this race also exists for operations like DROP TABLE and
TRUNCATE TABLE.

CREATE TABLESPACE ts LOCATION '/tmp/ts';
CREATE TABLE t (id int, pad text) TABLESPACE ts;
INSERT INTO t SELECT g, repeat('x', 500) FROM generate_series(1,50000) g;

-- Concurrent updates to dirty shared buffers (background sessions)
UPDATE t SET pad = repeat('y', 500) WHERE id <= 25000;

-- Move table away; FlushRelationBuffers sends SYNC_REQUESTs for old path
ALTER TABLE t SET TABLESPACE pg_default;

-- Drop tablespace forces checkpoint + removes directory
DROP TABLESPACE ts;

-- Next checkpoint hits stale entry -> PANIC
CHECKPOINT;

PANIC:  could not fsync file "pg_tblspc/41343/PG_19_202604061/5/41348": No such file or directory
LOG:  checkpointer process (PID 166749) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing

The fix is to add register_forget_request() in the else branch of
mdunlinkfork(), before register_unlink_segment(), matching what the
'if' branch already does.
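
One detail worth illustrating: the attached patch wraps the new call in a save/restore of errno, so that an error code left by the preceding truncate is not lost. A minimal sketch of that pattern, using hypothetical stand-ins (toy_truncate(), toy_forget_request(), toy_unlink_first_segment()) rather than the real md.c functions:

```c
#include <errno.h>

/* Hypothetical stand-in for a truncate that fails with ENOENT. */
static int
toy_truncate(void)
{
	errno = ENOENT;
	return -1;
}

/* Hypothetical stand-in for a call that may clobber errno as a side effect. */
static void
toy_forget_request(void)
{
	errno = EINVAL;
}

/*
 * Truncate, then cancel pending sync requests while preserving the errno
 * left by the truncate, so later error reporting sees the original cause.
 */
static int
toy_unlink_first_segment(int *reported_errno)
{
	int			ret = toy_truncate();
	int			save_errno = errno;

	toy_forget_request();
	errno = save_errno;

	*reported_errno = errno;
	return ret;
}
```

Without the save/restore, the caller in this sketch would see EINVAL from the helper instead of the ENOENT that describes the actual failure.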

Thoughts?

Thanks,
Satya

Attachment: 0001-Fix-checkpointer-PANIC-due-to-missing-fsync-cancel-i.patch

From 48089e678b425291442f4dc14d87ca25c524f65f Mon Sep 17 00:00:00 2001
From: Satyanarayana Narlapuram <satyanarlapuram@gmail.com>
Date: Tue, 14 Apr 2026 00:09:44 +0000
Subject: [PATCH] Fix checkpointer PANIC due to missing fsync cancel in
 mdunlinkfork()

Add missing register_forget_request() in the 'else' branch of
mdunlinkfork(), before register_unlink_segment(), matching what the
'if' branch already does.
---
 src/backend/storage/smgr/md.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index dee2903..d715307 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -419,6 +419,11 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
 		/* Prevent other backends' fds from holding on to the disk space */
 		ret = do_truncate(path.str);
 
+		/* Forget any pending sync requests for the first segment */
+		save_errno = errno;
+		register_forget_request(rlocator, forknum, 0 /* first seg */ );
+		errno = save_errno;
+
 		/* Register request to unlink first segment later */
 		save_errno = errno;
 		register_unlink_segment(rlocator, forknum, 0 /* first seg */ );
-- 
2.43.0