public inbox for [email protected]  
help / color / mirror / Atom feed
From: Alvaro Herrera <[email protected]>
To: Antonin Houska <[email protected]>
Cc: Mihail Nikalayeu <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Robert Treat <[email protected]>
Subject: Re: Adding REPACK [concurrently]
Date: Mon, 15 Dec 2025 15:25:22 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <210036.1765651719@localhost>

On 2025-Dec-13, Antonin Houska wrote:

> From 6279394135f2b693b6fffd174822509e0a067cbf Mon Sep 17 00:00:00 2001
> From: Antonin Houska <[email protected]>
> Date: Sat, 13 Dec 2025 19:27:18 +0100
> Subject: [PATCH 4/6] Add CONCURRENTLY option to REPACK command.

> diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
> index cc03f0706e9..a956892f42f 100644
> --- a/src/backend/replication/logical/decode.c
> +++ b/src/backend/replication/logical/decode.c
> @@ -472,6 +473,88 @@ heap_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)

> +	/*
> +	 * Second, skip records which do not contain sufficient information for
> +	 * the decoding.
> +	 *
> +	 * The problem we solve here is that REPACK CONCURRENTLY generates WAL
> +	 * when doing changes in the new table. Those changes should not be useful
> +	 * for any other user (such as logical replication subscription) because
> +	 * the new table will eventually be dropped (after REPACK CONCURRENTLY has
> +	 * assigned its file to the "old table").
> +	 */
> +	switch (info)
> +	{
> +		case XLOG_HEAP_INSERT:
> +			{
> +				xl_heap_insert *rec;
> +
> +				rec = (xl_heap_insert *) XLogRecGetData(buf->record);
> +
> +				/*
> +				 * This does happen when 1) raw_heap_insert marks the TOAST
> +				 * record as HEAP_INSERT_NO_LOGICAL, 2) REPACK CONCURRENTLY
> +				 * replays inserts performed by other backends.
> +				 */
> +				if ((rec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE) == 0)
> +					return;
> +
> +				break;
> +			}
> +
> +		case XLOG_HEAP_HOT_UPDATE:
> +		case XLOG_HEAP_UPDATE:
> +			{
> +				xl_heap_update *rec;
> +
> +				rec = (xl_heap_update *) XLogRecGetData(buf->record);
> +				if ((rec->flags &
> +					 (XLH_UPDATE_CONTAINS_NEW_TUPLE |
> +					  XLH_UPDATE_CONTAINS_OLD_TUPLE |
> +					  XLH_UPDATE_CONTAINS_OLD_KEY)) == 0)
> +					return;
> +
> +				break;
> +			}
> +
> +		case XLOG_HEAP_DELETE:
> +			{
> +				xl_heap_delete *rec;
> +
> +				rec = (xl_heap_delete *) XLogRecGetData(buf->record);
> +				if (rec->flags & XLH_DELETE_NO_LOGICAL)
> +					return;
> +				break;
> +			}
> +	}

I'm confused as to the purpose of this addition.  I took this whole
block out, and no tests seem to fail.  Moreover, some of the cases that
are being skipped because of this, would already be skipped by code in
DecodeInsert / DecodeUpdate anyway.  The case for XLOG_HEAP_DELETE seems
to have no effect (that is, the "return" there never hits for any tests
as far as I can tell.)

The reason I ask is that the line immediately below does this:

>  	ReorderBufferProcessXid(ctx->reorder, xid, buf->origptr);

which means the Xid is tracked for snapshot building purposes.  Which is
probably important, because of what the comment right below it says:

	/*
	 * If we don't have snapshot or we are just fast-forwarding, there is no
	 * point in decoding data changes. However, it's crucial to build the base
	 * snapshot during fast-forward mode (as is done in
	 * SnapBuildProcessChange()) because we require the snapshot's xmin when
	 * determining the candidate catalog_xmin for the replication slot. See
	 * SnapBuildProcessRunningXacts().
	 */

So what happens here is that we would skip processing the Xid of a xlog
record during snapshot-building, on the grounds that it doesn't contain
logical changes.  I'm not sure this is okay.  If we do indeed need this,
then perhaps it should be done after ReorderBufferProcessXid().

Or did you intend to make this conditional on the backend running
REPACK?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/





view thread (106+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Adding REPACK [concurrently]
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox