Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tqbBt-007X68-Hc for pgsql-hackers@arkaria.postgresql.org; Fri, 07 Mar 2025 17:08:53 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tqbBs-00CH51-5C for pgsql-hackers@arkaria.postgresql.org; Fri, 07 Mar 2025 17:08:52 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tqbBr-00CH15-Mw for pgsql-hackers@lists.postgresql.org; Fri, 07 Mar 2025 17:08:51 +0000 Received: from mail-lf1-x134.google.com ([2a00:1450:4864:20::134]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tqbBo-001Wrt-2C for pgsql-hackers@postgresql.org; Fri, 07 Mar 2025 17:08:50 +0000 Received: by mail-lf1-x134.google.com with SMTP id 2adb3069b0e04-54996d30bfbso276647e87.2 for ; Fri, 07 Mar 2025 09:08:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1741367327; x=1741972127; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=WQ5CNF7jPREIfdQaOcRDLn+Ouueu0oF5yR7PssHMgnY=; b=Zi24CWtIvbipPhMV9T0Xq3pk30lM7/4JlGG3eBQU5u/CrsX/jUdDTHgAECHxMkIdPt g2yGnqQt+DcOXGHmfA7ORs+cYOF96R+XynBBmQHyd6wTFNi3p8Kqw+i71xgZgwrfEkik 9Rb1zS7RQ9ugGL2wOUZC8PVr9r2JQ5hHos1RMLXryuZmWq42y+4DrTHm3cbpdd8FHt28 bew0dzo8xPWAuIN+xUfYYFeI3PhyMOgm8lG3tE/nZ9fD8IltOlG8IdBndmEHzjidEQq5 8/yQkR9CSTt8thLOVsOY98FwxodJ7p3XGSjcOfj4aRXWN+xQg29EhZN4wPjqW1S1FYVY HpcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741367327; x=1741972127; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WQ5CNF7jPREIfdQaOcRDLn+Ouueu0oF5yR7PssHMgnY=; b=cDW5h5OMsj9+LULXn20qC0Adxc7cp5LSLn3j8LlasY313pEB3vbuOUaTjXjocVadgp ZgYlSitBTXa1DCAB19FO0MpRwIjLI+N8uaKpLLK2h54FTGOkcrpCmNdul2NA/GWufqOZ MMkX/YsAvAx4a64gu+lIE5UASC6sW0OCZa4Zd7fTudx6kjpyrOwQTaS/1KUrspO7R7B2 NCkAQr6gjDPM/UDK7gFB+/jB0pkPViXcNhSSUSsWMSzzCRuM3BqgeNQmqAaMAeVpdJJE VA5AUSU1LoujlvfvX/LL7ZuovPjuVo8pFCgfJEJHZJvj6JppYTpTgTAVZE+9mULwmS0T nRKg== X-Forwarded-Encrypted: i=1; AJvYcCWLWaZs6SyqpelVu/esK9tWyWya1v5Bi6ksdl2RX9a10S63TKGObuLjO7dy2Imd4FEc5+3ciHAyrBF71+pZ@postgresql.org X-Gm-Message-State: AOJu0YxmLUfpcBJGsKQvIzm8w+z8XzUKkZuvKmBO7K6w6fVrn1bTwmWB usAvVDf4LLETu/zOriPwPGsfpLjK2+qDMRY6hJMcyJxyDa+G1K52m67Ei6OuUMmkHWkptaPeQJi b+czP9bUgmK8faLs+vBJHthn8acw= X-Gm-Gg: ASbGnctGliUvN5fsoeO9+WtSS6CmPL9c+bQ6aiKh5yDO4aZ1yH9rX3cXf2ENOrjcp2g AuK9Q8t7uRfvh0cenLGB0JOQb/5ybHyybW6nDR6kYf+dPzsNlD+0mlGZqJRZ66iFe1V2dfVuXF/ HZ48ITT4tVHNwQjmYw3kMANt2tciOf X-Google-Smtp-Source: AGHT+IGiResW3z4kwfYvDGlKbgO3395ZzNfQlZT4ostYTIBv0xgED+nA8MqyLnpW5k2wwf5CU4DUMqk516IBjkK+yzM= X-Received: by 2002:a05:6512:118a:b0:545:d7d:ac53 with SMTP id 2adb3069b0e04-54990ec173amr1828460e87.34.1741367326284; Fri, 07 Mar 2025 09:08:46 -0800 (PST) MIME-Version: 1.0 References: <2muwyx6a5vojkg7iegknhnkcch3lfxptsxk7icwuh7szkvvu2y@vc3ukkfvnu6i> In-Reply-To: From: Pavel Borisov Date: Fri, 7 Mar 2025 21:08:34 +0400 X-Gm-Features: AQ5f1JqnvYA0NJmadubGzDrdd0nsFA7vRlC_CJfXpHKEtL-ETRt7tC4qczRP0gY Message-ID: Subject: Re: pg_atomic_compare_exchange_*() and memory barriers To: Alexander Korotkov Cc: Andres Freund , pgsql-hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, Alexander and Andres! On Fri, 7 Mar 2025 at 20:40, Alexander Korotkov wrot= e: > > Hi, Andres. > > Thank you for reply. > > On Fri, Mar 7, 2025 at 6:02=E2=80=AFPM Andres Freund = wrote: > > > > On 2025-03-07 17:47:08 +0200, Alexander Korotkov wrote: > > > While investigating a bug in the patch to get rid of WALBufMappingLoc= k, I > > > found that the surrounding pg_atomic_compare_exchange_u64() fixes the > > > problem for me. > > > > That sounds more likely to be due to slowing down things enough to make= a race > > less likely to be hit. Or a compiler bug. > > > > > > > That doesn't feel right because, according to the > > > comments, both pg_atomic_compare_exchange_u32() and > > > pg_atomic_compare_exchange_u64() should provide full memory barrier > > > semantics. So, I decided to investigate this further. > > > > > > In my case, the implementation of pg_atomic_compare_exchange_u64() is= based > > > on __atomic_compare_exchange_n(). > > > > > > static inline bool > > > pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, > > > uint64 *expected, uint64 newval) > > > { > > > AssertPointerAlignment(expected, 8); > > > return __atomic_compare_exchange_n(&ptr->value, expected, newval,= false, > > > __ATOMIC_SEQ_CST, __ATOMIC_SEQ= _CST); > > > } > > > > > > According to the docs __ATOMIC_SEQ_CST enforces total ordering with *= all > > > other __ATOMIC_SEQ_CST operations*. It only says about other > > > __ATOMIC_SEQ_CST operations; nothing is said about regular reads/writ= es. > > > This sounds quite far from our comment, promising full barrier semant= ics, > > > which, in my understanding, means the equivalent of > > > both pg_read_barrier()/pg_write_barrier(), which should barrier *all = other > > > read/writes*. > > > > A memory barrier in one process/thread always needs be paired with a ba= rrier > > in another process/thread. It's not enough to have a memory barrier on = one > > side but not the other. > > Yes, there surely should be a memory barrier on another side. But > does __ATOMIC_SEQ_CST works as a barrier for the regular read/write > operations on the same side? According to a reference posted by Andres [1]: "Within a thread of execution, accesses (reads and writes) through volatile lvalues cannot be reordered past observable side-effects (including other volatile accesses) that are separated by a sequence point within the same thread, but this order is not guaranteed to be observed by another thread, since volatile access does not establish inter-thread synchronization." Also: "as soon as atomic operations that are not tagged memory_order_seq_cst enter the picture, the sequential consistency is lost" So I think Alexander is right in the original post. The order of everything not marked as ATOMIC_SEQ_CST (atomic or non-atomic operations) compared to atomic operations marked by ATOMIC_SEQ_CST is not warranted. > > __ATOMIC_SEQ_CST is used to implement the C11/C++11 barriers, and for t= hose > > it's more clearly formulated that that include acquire/release semantic= s. The > > standard formulation is a bit more complicated IIRC, but here's the > > cppreference.com simplification: > > > > https://en.cppreference.com/w/c/atomic/memory_order > > > A load operation with this memory order performs an acquire operatio= n, a > > > store performs a release operation, and read-modify-write performs b= oth an > > > acquire operation and a release operation, plus a single total order= exists > > > in which all threads observe all modifications in the same order (se= e > > > Sequentially-consistent ordering below). > > > Thank you. This is not yet clear for me. I probably need to meditate > more on this :) I think __ATOMIC_ACQ_REL is needed for success, and __ATOMIC_ACQUIRE might be sufficient for failure as failure of __atomic_compare_exchange_n contains no modification of atomic variable, so we want to see all modifications from the concurrent operations, but will not enforce them to see any "our" change. [2] Maybe I'm too optimistic here and having a full memory barrier is better. Thank you very much for considering this issue! I think the right operation of atomic primitives is both very very important and also hard to check thoroughly. I think the test that could reproduce the reordering of atomic and non-atomic operations could be very useful in regression set. [1] https://en.cppreference.com/w/c/atomic/memory_order#Sequentially-consis= tent_ordering [2] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html Best regards, Pavel Borisov Supabase