From: vignesh C <[email protected]>
To: Heikki Linnakangas <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Random pg_upgrade 004_subscription test failure on drongo
Date: Mon, 22 Sep 2025 14:28:35 +0530
Message-ID: <CALDaNm1NtWVosSSb9mp3OKic60em5HF2zmURC77MLWyYLMWqyw@mail.gmail.com>
In-Reply-To: <CALDaNm2y+nf-V9tjKwvbPprobZs1t_UrcCpJ0qYD5-KkOUFAyg@mail.gmail.com>
References: <CALDaNm3tjY44HoSwY84=XGEbTg0ruVfD4hAMTm=TgBqVysH4Qw@mail.gmail.com>
<[email protected]>
<CALDaNm2y+nf-V9tjKwvbPprobZs1t_UrcCpJ0qYD5-KkOUFAyg@mail.gmail.com>
On Fri, 21 Mar 2025 at 18:54, vignesh C <[email protected]> wrote:
>
> On Thu, 13 Mar 2025 at 18:10, Heikki Linnakangas <[email protected]> wrote:
> >
> >
> > Hmm, this problem isn't limited to this one pg_upgrade test, right? It
> > could happen with any pg_upgrade invocation. And perhaps in a running
> > server too, if a relfilenumber is reused quickly. In dropdb() and
> > DropTableSpace() we do this:
> >
> > WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
> >
> > Should we do the same here? Not sure where exactly to put that; perhaps
> > in mdcreate(), if the creation fails with STATUS_DELETE_PENDING.
>
> How about a patch similar to the attached one? I have run pg_upgrade
> tests multiple times, but unfortunately, I was unable to reproduce the
> issue or verify these changes.
CFBot reported a failure on one of its machines; here is an updated
version of the patch that addresses it.
Regards,
Vignesh
Attachments:
[application/octet-stream] v2-0001-Fix-issue-with-file-handle-retention-during-CREAT.patch (2.7K, 2-v2-0001-Fix-issue-with-file-handle-retention-during-CREAT.patch)
From f076ec514631034e081740291d069a1f20fbb0a1 Mon Sep 17 00:00:00 2001
From: Vignesh <[email protected]>
Date: Fri, 21 Mar 2025 18:24:48 +0530
Subject: [PATCH v2] Fix issue with file handle retention during CREATE
DATABASE in pg_restore
During upgrades, when pg_restore performs CREATE DATABASE, the
bgwriter or checkpointer may flush buffers and hold a file handle
for the table. This causes issues if the table needs to be re-created
later (e.g., after a TRUNCATE command), especially on OSes like older
versions of Windows, where unlinked files aren't fully removed until
they are no longer open.
This commit fixes the issue by checking for STATUS_DELETE_PENDING and
calling WaitForProcSignalBarrier, ensuring that all smgr file descriptors
are closed across all backends before retrying the file operation.
---
src/backend/storage/smgr/md.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 2ccb0faceb5..a97afedafdd 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -31,10 +31,20 @@
#include "miscadmin.h"
#include "pg_trace.h"
#include "pgstat.h"
+
+#if defined(WIN32) && !defined(__CYGWIN__)
+#include "port/win32ntdll.h"
+#endif
+
#include "storage/aio.h"
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/md.h"
+
+#if defined(WIN32) && !defined(__CYGWIN__)
+#include "storage/procsignal.h"
+#endif
+
#include "storage/relfilelocator.h"
#include "storage/smgr.h"
#include "storage/sync.h"
@@ -214,6 +224,9 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
MdfdVec *mdfd;
RelPathStr path;
File fd;
+#if defined(WIN32) && !defined(__CYGWIN__)
+ bool retryattempted = false;
+#endif
if (isRedo && reln->md_num_open_segs[forknum] > 0)
return; /* created and opened already... */
@@ -235,6 +248,9 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
path = relpath(reln->smgr_rlocator, forknum);
+#if defined(WIN32) && !defined(__CYGWIN__)
+retry:
+#endif
fd = PathNameOpenFile(path.str, _mdfd_open_flags() | O_CREAT | O_EXCL);
if (fd < 0)
@@ -245,6 +261,15 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
fd = PathNameOpenFile(path.str, _mdfd_open_flags());
if (fd < 0)
{
+#if defined(WIN32) && !defined(__CYGWIN__)
+ if (!retryattempted && pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
+ {
+ retryattempted = true;
+ WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
+ goto retry;
+ }
+#endif
+
/* be sure to report the error reported by create, not open */
errno = save_errno;
ereport(ERROR,
--
2.43.0