public inbox for [email protected]
help / color / mirror / Atom feedFrom: Alexander Korotkov <[email protected]>
To: pgsql-hackers <[email protected]>
Subject: pg_atomic_compare_exchange_*() and memory barriers
Date: Fri, 7 Mar 2025 17:47:08 +0200
Message-ID: <CAPpHfdudejddHLLw=Uk_7pZ=1g2SZXfwubSgmWUhJ7fBUtR9rg@mail.gmail.com> (raw)
Hi all,
While investigating a bug in the patch to get rid of WALBufMappingLock, I
found that the surrounding pg_atomic_compare_exchange_u64() fixes the
problem for me. That doesn't feel right because, according to the
comments, both pg_atomic_compare_exchange_u32() and
pg_atomic_compare_exchange_u64() should provide full memory barrier
semantics. So, I decided to investigate this further.
In my case, the implementation of pg_atomic_compare_exchange_u64() is based
on __atomic_compare_exchange_n().
static inline bool
pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
uint64 *expected, uint64 newval)
{
AssertPointerAlignment(expected, 8);
return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
According to the docs __ATOMIC_SEQ_CST enforces total ordering with *all
other __ATOMIC_SEQ_CST operations*. It only says about other
__ATOMIC_SEQ_CST operations; nothing is said about regular reads/writes.
This sounds quite far from our comment, promising full barrier semantics,
which, in my understanding, means the equivalent of
both pg_read_barrier()/pg_write_barrier(), which should barrier *all other
read/writes*.
I've found that __atomic_compare_exchange_n() ends up with usage
of __aarch64_cas8_acq_rel, which disassembles as following.
0000000000000860 <__aarch64_cas8_acq_rel>:
860: d503245f bti c
864: 90000110 adrp x16, 20000 <__data_start>
868: 3940a210 ldrb w16, [x16, #40]
86c: 34000070 cbz w16, 878 <__aarch64_cas8_acq_rel+0x18>
870: c8e0fc41 casal x0, x1, [x2]
874: d65f03c0 ret
878: aa0003f0 mov x16, x0
87c: c85ffc40 ldaxr x0, [x2]
880: eb10001f cmp x0, x16
884: 54000061 b.ne 890 <__aarch64_cas8_acq_rel+0x30> // b.any
888: c811fc41 stlxr w17, x1, [x2]
88c: 35ffff91 cbnz w17, 87c <__aarch64_cas8_acq_rel+0x1c>
890: d65f03c0 ret
This function first checks if LSE instructions are present. If so,
instruction LSE casal is used. If not (in my case), the loop of
ldaxr/stlxr is used instead. The documentation says both ldaxr/stlxr
provides one-way barriers. Read/writes after ldaxr will be observed after,
and read/writes before stlxr will be observed before. So, a pair of these
instructions provides a full memory barrier. However, if we don't observe
the expected value, only ldaxr gets executed. So, an unsuccessful
pg_atomic_compare_exchange_u64() attempt will leave us with a one-way
barrier, and that caused a bug in my case. In contrast,
__sync_val_compare_and_swap() uses __aarch64_cas8_sync, which is quite the
same as __aarch64_cas8_acq_rel, but has an explicit memory barrier in the
end.
I have a couple of ideas on how to fix this problem.
1. Provide full barrier semantics for pg_atomic_compare_exchange_*(). In
particular, for __atomic_compare_exchange_n(), we should probably
use __ATOMIC_ACQ_REL for success and run an explicit memory barrier in the
case of failure.
2. Document that pg_atomic_compare_exchange_*() doesn't necessarily provide
a memory barrier on failure. But then we need to carefully check if
existing use-cases need explicit memory barriers now.
Any thoughts?
Links
1.
https://www.postgresql.org/message-id/CAPpHfdsWcQb-u-9K%3DipneBf8CMhoUuBWKYc%2BXWJEHVdtONOepQ%40mail...
2. https://developer.arm.com/documentation/100941/0101/Barriers
------
Regards,
Alexander Korotkov
Supabase
view thread (20+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected]
Subject: Re: pg_atomic_compare_exchange_*() and memory barriers
In-Reply-To: <CAPpHfdudejddHLLw=Uk_7pZ=1g2SZXfwubSgmWUhJ7fBUtR9rg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox