Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tqa9o-007CfE-6H for pgsql-hackers@arkaria.postgresql.org; Fri, 07 Mar 2025 16:02:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tqa9m-009qV5-OQ for pgsql-hackers@arkaria.postgresql.org; Fri, 07 Mar 2025 16:02:38 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tqa9m-009qUh-Eg for pgsql-hackers@lists.postgresql.org; Fri, 07 Mar 2025 16:02:38 +0000 Received: from fout-b3-smtp.messagingengine.com ([202.12.124.146]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tqa9k-001WB5-1t for pgsql-hackers@postgresql.org; Fri, 07 Mar 2025 16:02:37 +0000 Received: from phl-compute-12.internal (phl-compute-12.phl.internal [10.202.2.52]) by mailfout.stl.internal (Postfix) with ESMTP id 3D44C1140186; Fri, 7 Mar 2025 11:02:36 -0500 (EST) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-12.internal (MEProxy); Fri, 07 Mar 2025 11:02:36 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm1; t=1741363356; x=1741449756; bh=ZjU8/SoayA MWVzyuW7HzchlqiDAe7nLR3wzqn3bdfhQ=; b=on50VAUT9edgUBE/GLmeJrrI1W 7wxvledjFeYjos6Tj+bpOuzB9tFIw3u5mvH5OIL0k2lCgTq32pF6j0qJIEVbizYY 8K5SQgSIiSkamTNHQf5ZTiqu0n0iG4rI/YNGETmLgY59/fEOuwrXPO8G7njUYDbT OZ3zp3JsylfaBMTQXxbKPqnKUZrFuYerusyPjEQpYPRuGTosVeIeqMtU4LS/9mhN 8Dgr76eqwAGqR2Ut5kFdpypMAzZKbgmbzAQs5K3SozbZI7lM/up86I/A5qE4Duq6 imRwEL+z7nZ3sylyRCqgEs/nudABt7rr70Pif+AE0/qCx/P653nqJ9w73UxQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t= 1741363356; x=1741449756; bh=ZjU8/SoayAMWVzyuW7HzchlqiDAe7nLR3wz qn3bdfhQ=; b=txgRfOZKpzU1/mbH2iZTgTXh7bAUX3Tlv+9JSHIKwoFJcr7e61n ABV1WKkCv7J58DIeQIHnBFBJJEKNvt2C6VM/Yf+7yPpXOOTv2ve8hZ60S1afhxc6 xZaYfTrW8fwdYq3AjA54eQ0CfunvZW3xtgKSclloARRHZebsyg2lnh7+N+v6ec5Q VSWixSh5/dfOVQ7U9H2BwUGzHbfUAU8WZ7CxUqI927x7OUwXgmpB35JTn9D92XGE XWxZCB33+hZi/ZSUL82EaLHRTqcQyf0sB0HIBq+nWUx+ZO2/Pm3852fppCEExW9C 7cLstEyC+qqZBeVWUX2TK2NRZKCVnGjbH5A== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefvddrtddtgdduuddutdekucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggv pdfurfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpih gvnhhtshculddquddttddmnecujfgurhepfffhvfevuffkfhggtggujgesthdtsfdttddt vdenucfhrhhomheptehnughrvghsucfhrhgvuhhnugcuoegrnhgurhgvshesrghnrghrrg iivghlrdguvgeqnecuggftrfgrthhtvghrnheptdfgfffhveetveefudelieegfeekvedu vefgkeefieegieffvdelhfetheffveefnecuffhomhgrihhnpegtphhprhgvfhgvrhgvnh gtvgdrtghomhdpsggrrhhrihgvrhhsrdhrvggrugenucevlhhushhtvghrufhiiigvpedt necurfgrrhgrmhepmhgrihhlfhhrohhmpegrnhgurhgvshesrghnrghrrgiivghlrdguvg dpnhgspghrtghpthhtohepvddpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtoheprggv khhorhhothhkohhvsehgmhgrihhlrdgtohhmpdhrtghpthhtohepphhgshhqlhdqhhgrtg hkvghrshesphhoshhtghhrvghsqhhlrdhorhhg X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 7 Mar 2025 11:02:34 -0500 (EST) Date: Fri, 7 Mar 2025 11:02:33 -0500 From: Andres Freund To: Alexander Korotkov Cc: pgsql-hackers Subject: Re: pg_atomic_compare_exchange_*() and memory barriers Message-ID: <2muwyx6a5vojkg7iegknhnkcch3lfxptsxk7icwuh7szkvvu2y@vc3ukkfvnu6i> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2025-03-07 17:47:08 +0200, Alexander Korotkov wrote: > While investigating a bug in the patch to get rid of WALBufMappingLock, I > found that the surrounding pg_atomic_compare_exchange_u64() fixes the > problem for me. That sounds more likely to be due to slowing down things enough to make a race less likely to be hit. Or a compiler bug. > That doesn't feel right because, according to the > comments, both pg_atomic_compare_exchange_u32() and > pg_atomic_compare_exchange_u64() should provide full memory barrier > semantics. So, I decided to investigate this further. > > In my case, the implementation of pg_atomic_compare_exchange_u64() is based > on __atomic_compare_exchange_n(). > > static inline bool > pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, > uint64 *expected, uint64 newval) > { > AssertPointerAlignment(expected, 8); > return __atomic_compare_exchange_n(&ptr->value, expected, newval, false, > __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST); > } > > According to the docs __ATOMIC_SEQ_CST enforces total ordering with *all > other __ATOMIC_SEQ_CST operations*. It only says about other > __ATOMIC_SEQ_CST operations; nothing is said about regular reads/writes. > This sounds quite far from our comment, promising full barrier semantics, > which, in my understanding, means the equivalent of > both pg_read_barrier()/pg_write_barrier(), which should barrier *all other > read/writes*. A memory barrier in one process/thread always needs be paired with a barrier in another process/thread. It's not enough to have a memory barrier on one side but not the other. __ATOMIC_SEQ_CST is used to implement the C11/C++11 barriers, and for those it's more clearly formulated that that include acquire/release semantics. The standard formulation is a bit more complicated IIRC, but here's the cppreference.com simplification: https://en.cppreference.com/w/c/atomic/memory_order > A load operation with this memory order performs an acquire operation, a > store performs a release operation, and read-modify-write performs both an > acquire operation and a release operation, plus a single total order exists > in which all threads observe all modifications in the same order (see > Sequentially-consistent ordering below). > > This function first checks if LSE instructions are present. If so, > instruction LSE casal is used. If not (in my case), the loop of > ldaxr/stlxr is used instead. The documentation says both ldaxr/stlxr > provides one-way barriers. Read/writes after ldaxr will be observed after, > and read/writes before stlxr will be observed before. So, a pair of these > instructions provides a full memory barrier. However, if we don't observe > the expected value, only ldaxr gets executed. So, an unsuccessful > pg_atomic_compare_exchange_u64() attempt will leave us with a one-way > barrier, and that caused a bug in my case. That has to be a compiler bug. We specify __ATOMIC_SEQ_CST for both success_memorder *and* failure_memorder. What compiler & version is this? > I have a couple of ideas on how to fix this problem. > 1. Provide full barrier semantics for pg_atomic_compare_exchange_*(). In > particular, for __atomic_compare_exchange_n(), we should probably > use __ATOMIC_ACQ_REL for success and run an explicit memory barrier in the > case of failure. I don't follow - __ATOMIC_SEQ_CST is *stronger* than __ATOMIC_ACQ_REL. Greetings, Andres Freund