Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tqbB5-007WwH-JZ for pgsql-hackers@arkaria.postgresql.org; Fri, 07 Mar 2025 17:08:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tqbB4-00CDaI-Br for pgsql-hackers@arkaria.postgresql.org; Fri, 07 Mar 2025 17:08:02 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tqbB3-00CDU2-7J for pgsql-hackers@lists.postgresql.org; Fri, 07 Mar 2025 17:08:02 +0000 Received: from fout-b4-smtp.messagingengine.com ([202.12.124.147]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tqbB1-001WrY-0R for pgsql-hackers@postgresql.org; Fri, 07 Mar 2025 17:08:00 +0000 Received: from phl-compute-12.internal (phl-compute-12.phl.internal [10.202.2.52]) by mailfout.stl.internal (Postfix) with ESMTP id 67FF1114019B; Fri, 7 Mar 2025 12:07:58 -0500 (EST) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-12.internal (MEProxy); Fri, 07 Mar 2025 12:07:58 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm1; t=1741367278; x=1741453678; bh=x9NEP6QZLO3hZBUpZgAecKpa9z+39gi9S8epJHc5iUc=; b= D+vWZZsGaMDDFvfjp4xRGThG3M3zh5Aqo73atvItmNmtNkJS3+bH/Ur/gbSXa22W HCjIY+BRbxHvZOKa0vFaVsoxa4XpofYgDGt8ZBbZI6xwkpX5aEgtgEnybp5MdGkz cvvh2ivFQdNia4CKDX9maQA94732vlpnfzLdhc/s5GmbJq0v/RHINmA2ovhPqvqA g5GgcDzVfsd233cQSxplBYiYkNUHgyT6yfOHetOL3P9oIzKLnQ/3HhH7cnzU/sYk 1vMKimazZCt+bAcxR6hMbL7qubetkOemfnAcBGevNGq6ecqcV2xpqCKUv8wMoVMi RU5ml1zJemasixctmi7gKA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1741367278; x= 1741453678; bh=x9NEP6QZLO3hZBUpZgAecKpa9z+39gi9S8epJHc5iUc=; b=x ZIa8kOcoZ3ZdOUOjz6v80a3ADqsIegOFZYUjrhD3MEdiLw8pmb8Sxo6QL1Zpj4iY gttfX6uZ/ZyxYnXvtxDG4fsxb42yK3ya5wrT4mAuMZEDWOnmA+Cev18vNQQT++T7 fAdQvKA75aKZJZY4cKAiPm8uaZHsNroqBDMVW7CAEJggg5IbzUsm/Ajz3M39H1SN m5C4qDHGrtiJg+6EWqg4z3f8GUo2M6fWgtL3Wm5bR48W41vHdx4paUKk9mEm91Dy dqTmPfieXmEelRg76VEakO3Fxt2JFRKs1tpSb4k/E/WzvjIYOLE/Qd98P6VZJKjr US8vkpNtE1DKurFeLfpCQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefvddrtddtgdduudduvddvucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggv pdfurfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpih gvnhhtshculddquddttddmnecujfgurhepfffhvfevuffkfhggtggugfgjsehtkefstddt tdejnecuhfhrohhmpeetnhgurhgvshcuhfhrvghunhguuceorghnughrvghssegrnhgrrh griigvlhdruggvqeenucggtffrrghtthgvrhhnpeeigeehudegudduuddtuddtfeevgfdt fefhvdehteeikeevvdeiieekleehgfeltdenucffohhmrghinhepsggrrhhrihgvrhhsrd hrvggrugenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhm pegrnhgurhgvshesrghnrghrrgiivghlrdguvgdpnhgspghrtghpthhtohepvddpmhhoug gvpehsmhhtphhouhhtpdhrtghpthhtoheprggvkhhorhhothhkohhvsehgmhgrihhlrdgt ohhmpdhrtghpthhtohepphhgshhqlhdqhhgrtghkvghrshesphhoshhtghhrvghsqhhlrd horhhg X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 7 Mar 2025 12:07:57 -0500 (EST) Date: Fri, 7 Mar 2025 12:07:56 -0500 From: Andres Freund To: Alexander Korotkov Cc: pgsql-hackers Subject: Re: pg_atomic_compare_exchange_*() and memory barriers Message-ID: References: <2muwyx6a5vojkg7iegknhnkcch3lfxptsxk7icwuh7szkvvu2y@vc3ukkfvnu6i> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2025-03-07 18:39:42 +0200, Alexander Korotkov wrote: > On Fri, Mar 7, 2025 at 6:02 PM Andres Freund wrote: > > > > On 2025-03-07 17:47:08 +0200, Alexander Korotkov wrote: > > > While investigating a bug in the patch to get rid of WALBufMappingLock, I > > > found that the surrounding pg_atomic_compare_exchange_u64() fixes the > > > problem for me. > > > > That sounds more likely to be due to slowing down things enough to make a race > > less likely to be hit. Or a compiler bug. > > > > > > > That doesn't feel right because, according to the > > > comments, both pg_atomic_compare_exchange_u32() and > > > pg_atomic_compare_exchange_u64() should provide full memory barrier > > > semantics. So, I decided to investigate this further. > > > > > > In my case, the implementation of pg_atomic_compare_exchange_u64() is based > > > on __atomic_compare_exchange_n(). > > > > > > static inline bool > > > pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, > > > uint64 *expected, uint64 newval) > > > { > > > AssertPointerAlignment(expected, 8); > > > return __atomic_compare_exchange_n(&ptr->value, expected, newval, false, > > > __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST); > > > } > > > > > > According to the docs __ATOMIC_SEQ_CST enforces total ordering with *all > > > other __ATOMIC_SEQ_CST operations*. It only says about other > > > __ATOMIC_SEQ_CST operations; nothing is said about regular reads/writes. > > > This sounds quite far from our comment, promising full barrier semantics, > > > which, in my understanding, means the equivalent of > > > both pg_read_barrier()/pg_write_barrier(), which should barrier *all other > > > read/writes*. > > > > A memory barrier in one process/thread always needs be paired with a barrier > > in another process/thread. It's not enough to have a memory barrier on one > > side but not the other. > > Yes, there surely should be a memory barrier on another side. But > does __ATOMIC_SEQ_CST works as a barrier for the regular read/write > operations on the same side? Yes, if it's paired with another barrier on the other side. The regular read/write operations are basically protected transitively, due to a) An acquire barrier preventing non-atomic reads/writes in the same thread from appearing to have been moved to before the barrier b) A release barrier preventing non-atomic reads/writes in the same thread from appearing to have been moved to after the barrier c) The other thread being guaranteed a) and b) for the other threads' non-atomic reads/writes depending on the type of barrier d) The atomic value itself being guaranteed to be, well, atomic > > > This function first checks if LSE instructions are present. If so, > > > instruction LSE casal is used. If not (in my case), the loop of > > > ldaxr/stlxr is used instead. The documentation says both ldaxr/stlxr > > > provides one-way barriers. Read/writes after ldaxr will be observed after, > > > and read/writes before stlxr will be observed before. So, a pair of these > > > instructions provides a full memory barrier. However, if we don't observe > > > the expected value, only ldaxr gets executed. So, an unsuccessful > > > pg_atomic_compare_exchange_u64() attempt will leave us with a one-way > > > barrier, and that caused a bug in my case. > > > > That has to be a compiler bug. We specify __ATOMIC_SEQ_CST for both > > success_memorder *and* failure_memorder. > > > > What compiler & version is this? > > I've tried and got the same results with two compilers. > gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 > Ubuntu clang version 19.1.1 (1ubuntu1~24.04.2) Thinking more about it I wonder if the behaviour of not doing a release in case the atomic fails *might* arguably actually be correct - if the compare-exchange fails, there's nothing that the non-atomic values could be ordered against. What is the access pattern and the observed problems with it that made you look at the disassembly? Greetings, Andres Freund