public inbox for [email protected]
help / color / mirror / Atom feedFrom: Peter J. Holzer <[email protected]>
To: [email protected]
Subject: Re: Repeatable Read Isolation Level "transaction start time"
Date: Sat, 5 Oct 2024 11:14:24 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <CAKAnmmLt2ezcf7-LgJkO_oeqK+Ei+XB6F3pHk2mnyadRq3corQ@mail.gmail.com>
<[email protected]>
<CANzqJaBjwtzLg_bZHkokUJbtA8gy6RpHjirfdBH9OJGfn5=tmw@mail.gmail.com>
<[email protected]>
<CAKAnmmLDpwgqu071t_Q7Fq349dYAgQim1x46U1DCAiiTcUmU2A@mail.gmail.com>
<CANzqJaAAQFD_JvK=ChmfdGCRNP0G5gMtVBgF52bZiFjQxH8N_Q@mail.gmail.com>
<CAKAnmmL05Lt89EOnMW111qdUOau+dYG7u8ff5o+ckKKohnNwGg@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
On 2024-09-25 18:09:44 -0400, Tom Lane wrote:
> "Peter J. Holzer" <[email protected]> writes:
> > On 2024-09-25 13:53:30 -0700, Christophe Pettus wrote:
> >> This might well be a failure of imagination on my part, but when would
> >> it pragmatically matter that the snapshot is taken at the first
> >> statement as opposed to at BEGIN?
>
> > It may make a difference if you're comparing timestamps.
>
> > For example, if you're using isolation level REPEATABLE READ and
> > (mistakenly) assume that the snapshot is taken at BEGIN, you would
> > expect any transaction_timestamp() written by a different transaction
> > and readable by this transaction to be earlier than the
> > transaction_timestamp() of this transaction.
>
> > But that's wrong because the other transaction could have happened
> > entirely in the time between your BEGIN and the statement which actually
> > triggers the snapshot.
>
> I don't find that hugely compelling, because there's always going
> to be some skew between the time we read the clock for the timestamp
> and the time we obtain the snapshot.
I wasn't arguing for a change in PostgreSQL's behaviour, just pointing
out a scenario where that would make a user-observable difference.
> Admittedly, that would normally not be a very long interval if BEGIN
> did both things ... but on a busy system you could lose the CPU for
> awhile in between.
Assuming that the system does have a global clock of sufficiently
fine resolution which returns strictly monotonically increasing
timestamps[1], I think the following is true:
Every snapshot divides the set of transactions into two non-overlapping
subsets: Those which have committed early enough that their effects are
visible in the snapshot and those which haven't. Let's call the first set
the "earlier" transactions and the second the "later" transactions. Let's
call the current transaction c and any transaction in the earlier set e
(we ignore the later transactions for now).
Performing a commit and taking a snapshot take some time, but there
should be a time t_C(e) in each commit and t_S(c) in the snapshot, such
that t_C(e) < t_S(c) for each "earlier" transaction.
Within each transaction each timestamp t which could be visible outside
of the transaction must have been obtained before the commit,
so t(e) < t_C(e) < t_S(c).
If we choose the transaction_timestamp to be >= t_S, then
transaction_timestamp(e) < t_C(e) < t_S(c) <= transaction_timestamp(c)
and therefore
transaction_timestamp(e) < transaction_timestamp(c)
Such a guarantee might be useful for some applications and it's not
(IMHO) an entirely unreasonable assumption, but it's not true for
PostgreSQL. So a programmer should be aware of that.
hp
[1] Not sure of this is true for Linux. It's certainly possible to never
step the clock backwards (not even during leap seconds), and the
resolution should be fine enough that you won't get two equal
timestamps on the same core, but with multiple cores I'm not
sure that processes running on different cores are guaranteed to get
strictly monotonically increasing timestamps. But as they say, while
all models are wrong, some are useful, and I think this is close
enough to reality to be useful.
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | [email protected] | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
view thread (11+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: Repeatable Read Isolation Level "transaction start time"
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox