public inbox for [email protected]  
help / color / mirror / Atom feed
From: Antonin Houska <[email protected]>
To: Mihail Nikalayeu <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: Robert Treat <[email protected]>
Cc: Pg Hackers <[email protected]>
Subject: Re: Adding REPACK [concurrently]
Date: Wed, 27 Aug 2025 08:16:07 +0200
Message-ID: <6931.1756275367@localhost> (raw)
In-Reply-To: <CADzfLwW3XAFKYkBY7Jktf_rETzcqMj20GXgjESqu+WEoeEMkog@mail.gmail.com>
References: <CADzfLwXx46j8KwQjjM1ZcqNBsx-k6GxHOzDJkm4SHjh+cv31Rw@mail.gmail.com>
	<[email protected]>
	<CADzfLwW=b=U3e6aasi=XorN8hZSiCKZErKs9qhyK7m=w=wokAg@mail.gmail.com>
	<40729.1755799624@localhost>
	<CADzfLwXpyzGGVB+nbSAMBWDBhzTyn6F2hRrAqcyJd3c6gT2EOQ@mail.gmail.com>
	<9536.1756127358@localhost>
	<CADzfLwWdw=5KhXE5Fr3khytp4mW2girFVJHHb4jSgEncc6237Q@mail.gmail.com>
	<21931.1756136535@localhost>
	<CADzfLwWPihwPM=e=EJ_U5niqDdVyq9pS30ZNxwZ8f_yXyikooQ@mail.gmail.com>
	<24483.1756142534@localhost>
	<CADzfLwUqyOmpkLmciecBy4aBN1sohQVZ2Hgc6m-tjSUqDRHwyQ@mail.gmail.com>
	<4790.1756197960@localhost>
	<CADzfLwV+80MfPM=MC5y3nA34djUWuYU6YKcZUO8JjD7_8p7nkg@mail.gmail.com>
	<29527.1756215093@localhost>
	<CADzfLwW3XAFKYkBY7Jktf_rETzcqMj20GXgjESqu+WEoeEMkog@mail.gmail.com>

Mihail Nikalayeu <[email protected]> wrote:

> Hello, Antonin!
> 
> Antonin Houska <[email protected]>:
> >
> > Where exactly should HeapTupleSatisfiesDirty() conclude that the tuple is
> > visible? TransactionIdIsCurrentTransactionId() will not do w/o the
> > modifications that you proposed earlier [1] and TransactionIdIsInProgress() is
> > not suitable as I explained in [2].
> 
> HeapTupleSatisfiesDirty is designed to respect even not-yet-committed
> transactions and provides additional related information.
> 
>     else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
>     {
>        /*
>         * Return the speculative token to caller.  Caller can worry about
>         * xmax, since it requires a conclusively locked row version, and
>         * a concurrent update to this tuple is a conflict of its
>         * purposes.
>         */
>        if (HeapTupleHeaderIsSpeculative(tuple))
>        {
>           snapshot->speculativeToken =
>              HeapTupleHeaderGetSpeculativeToken(tuple);
> 
>           Assert(snapshot->speculativeToken != 0);
>        }
> 
>        snapshot->xmin = HeapTupleHeaderGetRawXmin(tuple);
>        /* XXX shouldn't we fall through to look at xmax? */
>        return true;      /* in insertion by other */
>     }
> 
> So, it returns true when TransactionIdIsInProgress is true.
> However, that alone is not sufficient to trust the result in the common case.
> 
> You may check check_exclusion_or_unique_constraint or
> RelationFindReplTupleByIndex for that pattern:
> if xmin is set in the snapshot (a special hack in SnapshotDirty to
> provide additional information from the check), we wait for the
> ongoing transaction (or one that is actually committed but not yet
> properly reflected in the proc array), and then retry the entire tuple
> search.
> 
> So, the race condition you explained in [2] will be resolved by a
> retry, and the changes to TransactionIdIsInProgress described in [1]
> are not necessary.

I insist that this is a misuse of TransactionIdIsInProgress(). When dealing
with logical decoding, only WAL should tell whether particular transaction is
still running. AFAICS this is how reorderbuffer.c works.

A new kind of snapshot seems like (much) cleaner solution at the moment.

> I'll try to make some kind of prototype this weekend + cover race
> condition you mentioned in specs.
> Maybe some corner cases will appear.

No rush. First, the MVCC safety is not likely to be included in v19
[1]. Second, I think it's good to let others propose their ideas before
writing code.

> By the way, there's one more optimization we could apply in both
> MVCC-safe and non-MVCC-safe cases: setting the HEAP_XMIN_COMMITTED /
> HEAP_XMAX_COMMITTED bit in the new table:
> * in the MVCC-safe approach, the transaction is already committed.
> * in the non-MVCC-safe case, it isn’t committed yet - but no one will
> examine that bit before it commits (though this approach does feel
> more fragile).
> 
> This could help avoid potential storms of full-page writes caused by
> SetHintBit after the table switch.

Good idea, thanks.

[1] https://www.postgresql.org/message-id/202504040733.ysuy5gad55md%40alvherre.pgsql

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com





view thread (106+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Adding REPACK [concurrently]
  In-Reply-To: <6931.1756275367@localhost>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox