Subject: Re: Strange deadlock with object/target of lock : transaction
From: Achilleas Mantzios
To: Adrian Klaver, "pgsql-general@lists.postgresql.org"
Date: Thu, 14 Aug 2025 16:01:12 +0100

Hi Adrian

On 8/14/25 15:39, Adrian Klaver wrote:

> On 8/14/25 00:07, Achilleas Mantzios wrote:
>> Hi All
>>
>> We've been hit by a weird deadlock which it took me some days to isolate and replicate. It does not have to do with order of updates or any explicit TABLE-level locking, the objects/targets of the deadlock in question are transactions.

First off, I may be wrong about the above conclusion. I noticed that even in the common deadlock scenario (xact A updating object 1 and then 2, while xact B updates 2 and then 1) the message is the same, i.e.

Process <pid1> waits for ShareLock on transaction <xactB>; blocked by process <pid2>.

Process <pid2> waits for ShareLock on transaction <xactA>; blocked by process <pid1>.

while updating tuple ()...
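To make the common scenario concrete, here is a minimal sketch of why both lines of the message name a transaction as the lock target. It is a toy model in Python, not PostgreSQL code: the names `simulate` and `find_cycle` are hypothetical, and the only assumption carried over from PostgreSQL is that a writer blocked on a row waits for the transaction holding that row to finish. Each edge in the waits-for map corresponds to one "waits for ShareLock on transaction" line.

```python
# Toy model (assumption: NOT PostgreSQL's deadlock detector). A row lock is
# held by a transaction, and a blocked writer waits on that transaction.

def find_cycle(waits_for):
    """Return the transactions forming a wait cycle, or None if there is none."""
    for start in waits_for:
        path = [start]
        seen = {start}
        current = waits_for.get(start)
        while current is not None:
            if current == start:
                return path          # walked back to where we started
            if current in seen:
                break                # loop that does not include `start`
            seen.add(current)
            path.append(current)
            current = waits_for.get(current)
    return None


def simulate(schedule):
    """Replay (xact, row) update steps; return the wait cycle, if any."""
    row_holder = {}   # row -> transaction holding its row lock
    waits_for = {}    # blocked transaction -> transaction it waits on
    for xact, row in schedule:
        if xact in waits_for:
            continue  # a blocked transaction cannot issue more statements
        holder = row_holder.get(row)
        if holder is None or holder == xact:
            row_holder[row] = xact       # lock acquired immediately
        else:
            waits_for[xact] = holder     # "waits for ShareLock on transaction"
            cycle = find_cycle(waits_for)
            if cycle:
                return cycle
    return None


# xact A updates object 1 then 2, while xact B updates 2 then 1.
classic = [("A", 1), ("B", 2), ("A", 2), ("B", 1)]
# Both transactions touch the objects in the same order.
ordered = [("A", 1), ("B", 1), ("A", 2), ("B", 2)]

print(simulate(classic))
print(simulate(ordered))
```

In this model the opposite-order schedule ends in a two-transaction cycle, while the same-order schedule only makes B wait until A finishes. That matches the wording of the message: what is reported is the transaction being waited on, not the tuple.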

Also, I should have mentioned that it takes at least three transactions, as in the example, to make the deadlock happen: at least two of the "UPDATE" style and one of the "INSERT" style.

> I have some questions:
>
> 1) Did this work in versions prior to 18?
No, our production is on 16.9, and this is where I hit the issue.

> 2) The test case you ran was done on 18beta1, are you planning to test on the just released 18beta3?
I must upgrade, but I don't think anything will change; this behavior seems consistent at least across 16 -> 18beta1.

