From: Mihail Nikalayeu
Date: Thu, 11 Dec 2025 21:38:00 +0100
Subject: Re: Adding REPACK [concurrently]
To: Antonin Houska
Cc: Alvaro Herrera, Pg Hackers, Robert Treat
In-Reply-To: <171530.1765306357@localhost>
References: <202510301734.pj4uds3mqxx4@alvherre.pgsql> <116433.1764870207@localhost> <171530.1765306357@localhost>

Hello, Antonin!

On Tue, Dec 9, 2025 at 7:52 PM Antonin Houska wrote:
> Worker makes more sense to me - the initial implementation is in 0005.

Comments for 0005, so far:

---
> export_initial_snapshot

Hm, should we use ExportSnapshot instead? And ImportSnapshot to import it.

---
> get_initial_snapshot

Should we check whether the worker is still alive while waiting? The same
applies in "process_concurrent_changes".
And AFAIU RegisterDynamicBackgroundWorker does not guarantee that new
workers will actually be started (in case of some fork-related issues).

---
> Assert(res = SHM_MQ_DETACHED);

This should be "==", not "=".

---
> /* Wait a bit before we retry reading WAL. */
> (void) WaitLatch(MyLatch,
>                  WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
>                  1000L,
>                  WAIT_EVENT_REPACK_WORKER_MAIN);

Looks like we need ResetLatch(MyLatch); here.

---
> * - decoding_ctx - logical decoding context, to capture concurrent data

Needs to be removed together with the parameters.

---
> hpm_context = AllocSetContextCreate(TopMemoryContext,
>                                     "ProcessParallelMessages",
>                                     ALLOCSET_DEFAULT_SIZES);

Should be "ProcessRepackMessages".

---
> if (XLogRecPtrIsInvalid(lsn_upto))
> {
>     SpinLockAcquire(&shared->mutex);
>     lsn_upto = shared->lsn_upto;
>     /* 'done' should be set at the same time as 'lsn_upto' */
>     done = shared->done;
>     SpinLockRelease(&shared->mutex);
>
>     /* Check if the work happens to be complete. */
>     continue;
> }

This may be moved to the start of the loop to avoid duplication.

---
> SpinLockAcquire(&shared->mutex);
> valid = shared->sfs_valid;
> SpinLockRelease(&shared->mutex);

Better to remember last_exported here to avoid any races/misses.

---
> shared->lsn_upto = InvalidXLogRecPtr;

I think it is better to clear it once it is read (after removing the
duplication).

---
> bool done;

bool exit_after_lsn_upto?

---
> bool sfs_valid;

Do we really need it? I think it is better to leave only last_exported,
add an argument (last_processed_file) to process_concurrent_changes, and
wait there for last_exported to become higher.

---
What if we reverse the roles of leader and worker? The leader gets a
snapshot and transfers it to the workers (probably multiple, for a
parallel scan) using the mechanics that already exist - the workers
process the scan of the table in parallel. The leader decodes the WAL.
Also, workers may be assigned a list of indexes they need to build.

Feels like it reuses more of the current infrastructure and also needs
less distinct synchronization logic. But I'm not sure about the index
phase - maybe it is not so easy to do.

---
Also, should we add some kind of back pressure between building the
indexes/new heap and the amount of WAL we have? But probably it is out of
scope of the patch.

---
To build N indexes we need to scan the table N times. What about building
multiple indexes during a single heap scan?

---
Just a gentle reminder about the XMIN_COMMITTED flag and the WAL storm
after the switch.

Best regards,
Mikhail.