public inbox for [email protected]
help / color / mirror / Atom feedFrom: Mihail Nikalayeu <[email protected]>
To: Antonin Houska <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Robert Treat <[email protected]>
Subject: Re: Adding REPACK [concurrently]
Date: Thu, 11 Dec 2025 21:38:00 +0100
Message-ID: <CADzfLwXp4c-MJx7yVDxAGNNxPbX4o9dqyivxavtHvmUsdXYqBQ@mail.gmail.com> (raw)
In-Reply-To: <171530.1765306357@localhost>
References: <[email protected]>
<116433.1764870207@localhost>
<CADzfLwWFbXVN-QrKaVXvW96eYQD1AiRVCYd99nX7EQFG3q_yfg@mail.gmail.com>
<CADzfLwWS6Ukme5uhv9=1ZyyG=D5Bp0dE+zGn=qcfGV=jSY6mpw@mail.gmail.com>
<CADzfLwXudUtPi1xFC_CBpGP=vSmDY4pAvBbS4_BCwOUyNTT5WA@mail.gmail.com>
<171530.1765306357@localhost>
Hello, Antonin!
On Tue, Dec 9, 2025 at 7:52 PM Antonin Houska <[email protected]> wrote:
> Worker makes more sense to me - the initial implementation is in 0005.
Comments for 0005, so far:
---
> export_initial_snapshot
Hm, should we use ExportSnapshot instead? And ImportSnapshort to import it.
---
> get_initial_snapshot
Should we check if a worker is still alive while waiting? Also is
"process_concurrent_changes".
And AFAIU RegisterDynamicBackgroundWorker does not guarantee new
workers to be started (in case of some fork-related issues).
---
> Assert(res = SHM_MQ_DETACHED);
==
---
> /* Wait a bit before we retry reading WAL. */
> (void) WaitLatch(MyLatch,
> WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
> 1000L,
> WAIT_EVENT_REPACK_WORKER_MAIN);
Looks like we need ResetLatch(MyLatch); here.
---
> * - decoding_ctx - logical decoding context, to capture concurrent data
Need to be removed together with parameters.
---
> hpm_context = AllocSetContextCreate(TopMemoryContext,
> "ProcessParallelMessages",
> ALLOCSET_DEFAULT_SIZES);
"ProcessRepacklMessages"
---
> if (XLogRecPtrIsInvalid(lsn_upto))
> {
> SpinLockAcquire(&shared->mutex);
> lsn_upto = shared->lsn_upto;
> /* 'done' should be set at the same time as 'lsn_upto' */
> done = shared->done;
> SpinLockRelease(&shared->mutex);
>
> /* Check if the work happens to be complete. */
> continue;
> }
May be moved to the start of the loop to avoid duplication.
---
> SpinLockAcquire(&shared->mutex);
> valid = shared->sfs_valid;
> SpinLockRelease(&shared->mutex);
Better to remember last_exported here to avoid any races/misses.
---
> shared->lsn_upto = InvalidXLogRecPtr;
I think it is better to clear it once it is read (after removing duplication).
---
> bool done;
bool exit_after_lsn_upto?
---
> bool sfs_valid;
Do we really need it? I think it is better to leave only last_exported
and in process_concurrent_changes wait add argument
(last_processed_file) and wait for last_exported to become higher.
---
What if we reverse roles of leader-worker?
Leader gets a snapshot, transfers it to workers (multiple probably for
parallel scan) using already ready mechanics - workers are processing
the scan of the table in parallel. Leader decodes the WAL.
Also, workers may be assigned with a list of indexes they need to build.
Feels like it reuses more from current infrastructure and also needs
less different synchronization logic. But I'm not sure about the
indexes phase - maybe it is not so easy to do.
---
Also, should we add some kind of back pressure between building
indexes/new heap and num of WAL we have?
But probably it is out of scope of the patch.
---
To build N indexes we need to scan table N times. What is about
building multiple indexes during a single heap scan?
--
Just a gentle reminder about the XMIN_COMMITTED flag and WAL storm
after the switch.
Best regards,
Mikhail.
view thread (106+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Adding REPACK [concurrently]
In-Reply-To: <CADzfLwXp4c-MJx7yVDxAGNNxPbX4o9dqyivxavtHvmUsdXYqBQ@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox