Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ss3mh-005nMQ-5s for pgsql-general@arkaria.postgresql.org; Sat, 21 Sep 2024 17:20:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ss3me-006p4K-27 for pgsql-general@arkaria.postgresql.org; Sat, 21 Sep 2024 17:20:37 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ss3mc-006p2u-C8 for pgsql-general@lists.postgresql.org; Sat, 21 Sep 2024 17:20:37 +0000 Received: from fout2-smtp.messagingengine.com ([103.168.172.145]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ss3mY-000Ndk-4Q for pgsql-general@lists.postgresql.org; Sat, 21 Sep 2024 17:20:34 +0000 Received: from phl-compute-08.internal (phl-compute-08.phl.internal [10.202.2.48]) by mailfout.phl.internal (Postfix) with ESMTP id 4CC2D1380213; Sat, 21 Sep 2024 13:20:29 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-08.internal (MEProxy); Sat, 21 Sep 2024 13:20:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aklaver.com; h= cc:content-transfer-encoding:content-type:content-type:date:date :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm1; t=1726939229; x=1727025629; bh=Nhg3s9y31nE6Ww2odaUbRSHpfGE0VzNgjk+vJ0MVYNk=; b= o4/JaK11Yf30WmfrcE9hHjVX2qjqwwEldTULgbzGvnP6kDsUNj+wQlyH1dkBzhfg pbU0kCtXhS6v4v9f03Wvn/A1Y7cy73bXwQrScwARsfcPAJOpWlePGjEXrSWbVg7d 1pIxMRU1GgZP+CKBR7637prn8GXBP0HQ3ByFatbQkxOoMeVjyHg10HfVOHvlqHdI YYnVcEMaWnlBfLZUiaIOWQm15rPTNr0G7p8NSd5FVhj9XsU1+BY9naRPSy+TX1Md zu7b0tzmNdea5EQxT2/3BaDkB5txAL+ryEYd0sfA5g4v5apD0udIkMid98kpyP8e QsIa6BKb305ZyEGHPVVRTA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1726939229; x= 1727025629; bh=Nhg3s9y31nE6Ww2odaUbRSHpfGE0VzNgjk+vJ0MVYNk=; b=S iijt/CNIkixwvglyTwLEXbTmLz4KguzKA4Fl8cpyISI8HNazmkh7A8r/x3AKynGp RPUg27kROSNqiEJ/+DaLHJFmYBSCBhudhYQz5juUmWqHcNXJr8RXwzUV6TfLeO6X YSc0Syzy2DvkhphOz459vW0UCmAPTcfkV7/QVPRHCaAY02ULmTK2qB7yY9I6AtRg 1biAnmaWsKh8gZ4zeqTMEa281yu9Q/kQoV4verh2oavtK0jv51P6sVZESzjAcQYt Ma2BC/Siuj9Kh/8RTkPHjlq6kp2K2HDVU3TjyBQoPapxrMFZLnf35wntHrKCPyGe EYGSUJ//6lx4+x5j8rHvQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrudelhedgudduvdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpggftfghnshhusghstghrihgsvgdp uffrtefokffrpgfnqfghnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivg hnthhsucdlqddutddtmdenucfjughrpefkffggfgfuvfhfhfgjtgfgsehtkeertddtvdej necuhfhrohhmpeetughrihgrnhcumfhlrghvvghruceorggurhhirghnrdhklhgrvhgvrh esrghklhgrvhgvrhdrtghomheqnecuggftrfgrthhtvghrnhepffelgeeifefgveduhedt hfekuedtffejveegffegjeevtdehgfduieetfeehjeehnecuvehluhhsthgvrhfuihiivg eptdenucfrrghrrghmpehmrghilhhfrhhomheprggurhhirghnrdhklhgrvhgvrhesrghk lhgrvhgvrhdrtghomhdpnhgspghrtghpthhtohepvddpmhhouggvpehsmhhtphhouhhtpd hrtghpthhtohepvhgvvghmrgdttddttdesghhmrghilhdrtghomhdprhgtphhtthhopehp ghhsqhhlqdhgvghnvghrrghlsehlihhsthhsrdhpohhsthhgrhgvshhqlhdrohhrgh X-ME-Proxy: Feedback-ID: i76984098:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sat, 21 Sep 2024 13:20:28 -0400 (EDT) Message-ID: <55697e68-ee97-4571-9b71-c91a9373d40b@aklaver.com> Date: Sat, 21 Sep 2024 10:20:27 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: IO related waits To: veem v , pgsql-general@lists.postgresql.org References: <586d988d-978b-4b08-84b4-ef3ebb8736a8@aklaver.com> <4cfe720e-4283-43fc-8d72-8fa11e2f741a@aklaver.com> <20240920221718.ctq5whqfyqlrjwr2@hjp.at> Content-Language: en-US From: Adrian Klaver In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 9/21/24 02:36, veem v wrote: > > > > Actually here the application is using kafka and  flink stream and is > using one of existing code in which it was doing row by row commit which > is now changed to Batch commit i.e. the commit point is shifted from row > by row to batch now. There are multiple sessions spawning at the same > time to process the incoming messages 24/7. And also as described in > another ongoing thread and also we saw in the past we did not see much > difference between "batch commit" and "Batch insert" performance. We > only realized the row by row commit is having worse performance. The bottom line is that to solve this a cost is going to have to be paid somewhere. Previously it was done with autocommit in the form of slow insert performance. You improved the speed of the inserts by wrapping multiple inserts in transactions and that led you to this problem, where open transactions across sessions is leading to deadlock issues due to the same id being inserted in concurrent open sessions. Per my and Greg Sabino Mullane comments the solution is going to need planning. Right now you are playing a round of Whack-a-Mole by making ad-hoc changes of portions of the process without reference to the process as a whole.At some point the parallel duplicate ids(records) need to be straightened out into a serial application of data. You and the other people involved need to come up with a coherent view of the process as whole with a goal to achieving that. Then you can start planning on where that cost is best paid: 1) In the data before the streaming. 2) In the streaming process itself 3) In the database or 4) Spread out across 1-4. > > Now, in such a scenario when the messages are coming from a streaming > framework using kafka +flink and also the insert is happening using row > by row only (but just the commit is shifted to batch), i don't see any > way to sort the ID columns in this streaming process, so that they > won't overlap across session. > > In such a situation , the only way seems to have the messages replayed > for which the deadlock error happens , as I think during a deadlock > error, one session gets terminated by postgres and that messages perhap > we can save in some exception table and then replay? > -- Adrian Klaver adrian.klaver@aklaver.com