public inbox for [email protected]
help / color / mirror / Atom feedFrom: Hannu Krosing <[email protected]>
To: PostgreSQL Hackers <[email protected]>
Cc: Antonin Houska <[email protected]>
Cc: Matthias van de Meent <[email protected]>
Cc: Mihail Nikalayeu <[email protected]>
Cc: Sergey Sargsyan <[email protected]>
Cc: [email protected]
Cc: Andres Freund <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Melanie Plageman <[email protected]>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: Thu, 4 Dec 2025 22:03:33 +0100
Message-ID: <CAMT0RQRT2wjMtH0iDLkzqAh9=SSo+Oed7jDT3jvsHXOgjY91Gw@mail.gmail.com> (raw)
In-Reply-To: <120851.1764875744@localhost>
References: <CADzfLwW9QczZW-E=McxcjUv0e5VMDctQNETbgao0K-SimVhFPA@mail.gmail.com>
<[email protected]>
<CADzfLwXKtriMnfCNVGNH2ahwXaByjo-QOMWiDTU-9WZqh+zQ5g@mail.gmail.com>
<CADzfLwW5bDWSxjHK7mqX8Lewki3+5FBydBC+nVcxg4xMGKscyw@mail.gmail.com>
<CAMAof6-4xaV3QE2ErYJaJhu6qjFn99sWyo_HQeBhHikZM3GexA@mail.gmail.com>
<CADzfLwXocKhpW3eFP1oScz+m+1XJ3bpi9QmVpoqC9RX9oyX=UA@mail.gmail.com>
<CAMAof695VA+mbVRhWCTus=E0WnsMAQyqXxfOTohbcb7VUHSP4g@mail.gmail.com>
<CAMAof69JSL8MYWG2qRScs3RQDpfcyZT_wFwW4SoAvftW+K_p1g@mail.gmail.com>
<CADzfLwVMtwjHh8KY9kP=_vcYPqHs=JDzuexO4RFQ2fM8VoqovA@mail.gmail.com>
<CAMAof68L0GO0F0bwuXtLZAjh9k_Hj+o0-8mqfO6iEQyXr4PuVA@mail.gmail.com>
<CADzfLwUrodAcOggK+3j3LbPLaSXemgHxa-n=LhZTwRAsaakL2g@mail.gmail.com>
<CAMAof691D4O=3QTuPwJXBYxYpG6s3A=tVhL9vN=T3eeRTMnaig@mail.gmail.com>
<CADzfLwVT3Y14g6Maz2y92sP2L7rPvpznt+MHM++xiy-U3XMLZQ@mail.gmail.com>
<CADzfLwXQe9XfQfJs3W-DCPqeqG4rq-6FoYUpGbbpgjcT1Eotpg@mail.gmail.com>
<CAMAof68kNgwWdkhmZd1ysfyU3PF66Wz+UaUr9g-LJg-_0xBV_Q@mail.gmail.com>
<CADzfLwUtLqYrupZp4QQuWwv4W_LgYWBRStybvQ+S0SZiHrp62A@mail.gmail.com>
<CADzfLwVYUBb8cUVQ_1mzVzNMyJH84VZKFCRyATvBZKbLW377CA@mail.gmail.com>
<CADzfLwWbV1i7+cP_Hqr3qgQnBXkAqgrCQxd5PFzqp2AOTK=40w@mail.gmail.com>
<CADzfLwXJc0jdDDS43-Fj0gKmwX-FURS3eY7MyLQ89qDPA6T5Ug@mail.gmail.com>
<CADzfLwVaV15R2rUNZmKqLKweiN3SnUBg=6_qGE_ERb7cdQUD8g@mail.gmail.com>
<CAEze2WgBffcC_SKGLmVxW8uRTEsrwWOHDQujN6zyxy1tSYLJ=Q@mail.gmail.com>
<CADzfLwVon8ESWOkg+8KU0F9=Hg7QKriNVX-hqcm-v-XZmHkzig@mail.gmail.com>
<CAEze2WiXYx1LKr=9d7PLsZOYrGytY9AN__tFFw4p_Ysgm1-e5g@mail.gmail.com>
<CADzfLwUKXcXKZgX+e8ACsOXe_CgtWmNJY_6dyn8EO0AXYOn2pA@mail.gmail.com>
<CAEze2WiiR2PeXg_vaURjjiiwvjQ=Um8wxWi1BcVS0BGyxiD2gQ@mail.gmail.com>
<5784.1764580169@localhost>
<CAEze2Wi_wZqxzZE0g==XuROUUnhpLwZM6=9Snkxbo7-ZH2J2yg@mail.gmail.com>
<5293.1764837294@localhost>
<CAEze2WgCuzsFpH_=ySAToT7bAFyj_cb82ZFnMeexnd-NYiVeOw@mail.gmail.com>
<120851.1764875744@localhost>
I just sent a small patch for logical decoding to pgsql-hackers@
exposing to logical decoding old and new tuple ids and a boolean
telling if an UPDATE is HOT.
Feel free to test if this helps here as well
On Thu, Dec 4, 2025 at 8:15 PM Antonin Houska <[email protected]> wrote:
>
> Matthias van de Meent <[email protected]> wrote:
>
> > On Thu, 4 Dec 2025 at 09:34, Antonin Houska <[email protected]> wrote:
> > >
> > > ISTM that what you consider a problem is copying the table using PGPROC-based
> > > snapshot and applying logically decoded commits to the result - is that what
> > > you mean?
> >
> > Correct.
> >
> > > In fact, LR (and also REPACK) uses snapshots generated by the logical decoding
> > > system. The information on running/committed transactions is based here on
> > > replaying WAL, not on PGPROC.
> >
> > OK, that's good to know. For reference, do you know where this is
> > documented, explained, or implemented?
>
> All my knowledge of these things is from source code.
>
> > I'm asking, because the code that I could find didn't seem use any
> > special snapshot (tablesync.c uses
> > `PushActiveSnapshot(GetTransactionSnapshot())`),
>
> My understanding is that this is what happens on the subscription side. Some
> lines above that however, walrcv_create_slot(..., CRS_USE_SNAPSHOT, ...) is
> called which in turn calls CreateReplicationSlot(..., CRS_USE_SNAPSHOT, ...)
> on the publication side and it sets that snapshot for the transaction on the
> remote (publication) side:
>
> else if (snapshot_action == CRS_USE_SNAPSHOT)
> {
> Snapshot snap;
>
> snap = SnapBuildInitialSnapshot(ctx->snapshot_builder);
> RestoreTransactionSnapshot(snap, MyProc);
> }
>
> > and the other
> > reference to LR's snapshots (snapbuild.c, and inside
> > `GetTransactionSnapshot()`) explicitly said that its snapshots are
> > only to be used for catalog lookups, never for general-purpose
> > queries.
>
> I think the reason is that snapbuild.c only maintains snapshots for catalog
> scans, because in logical decoding you only need to scan catalog tables. This
> is especially to find out which tuple descriptor was valid when particular
> data change (INSERT / UPDATE / DELETE) was WAL-logged - the output plugin
> needs the correct version of tuple descriptor to deform each tuple. However
> there is no need to scan non-catalog tables: as long as wal_level=logical, the
> WAL records contains all the information needed for logical replication
> (including key values). So snapbuild.c only keeps track of transactions that
> modify system catalog and uses this information to create the snapshots.
>
> A special case is if you pass need_full_snapshot=true to
> CreateInitDecodingContext(). In this case the snapshot builder tracks commits
> of all transactions, but only does so until SNAPBUILD_CONSISTENT state is
> reached. Thus, just before the actual decoding starts, you can get a snapshot
> to scan even non-catalog tables (SnapBuildInitialSnapshot() creates that, like
> in the code above). (For REPACK, I'm trying to teach snapbuild.c recognize
> that transaction changed one particular non-catalog table, so it can build
> snapshots to scan this one table anytime.)
>
> Another reason not to use those snapshots for non-catalog tables is that
> snapbuild.c creates snapshots of the kind SNAPSHOT_HISTORIC_MVCC. If you used
> this for non-catalog tables, HeapTupleSatisfiesHistoricMVCC() would be used
> for visibility checks instead of HeapTupleSatisfiesMVCC(). The latter can
> handle tuples surviving from older version of postgres, but the earlier
> cannot:
>
> /* Used by pre-9.0 binary upgrades */
> if (tuple->t_infomask & HEAP_MOVED_OFF)
>
> No such tuples should appear in the catalog because initdb always creates it
> from scratch.
>
> For LR, SnapBuildInitialSnapshot() takes care of the conversion from
> SNAPSHOT_HISTORIC_MVCC to SNAPSHOT_MVCC.
>
> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
>
>
view thread (64+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
In-Reply-To: <CAMT0RQRT2wjMtH0iDLkzqAh9=SSo+Oed7jDT3jvsHXOgjY91Gw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox