Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tg4XY-007UwR-PY for pgsql-general@arkaria.postgresql.org; Thu, 06 Feb 2025 16:15:45 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tg4XT-00922U-7P for pgsql-general@arkaria.postgresql.org; Thu, 06 Feb 2025 16:15:39 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tg4XR-00920I-VO for pgsql-general@lists.postgresql.org; Thu, 06 Feb 2025 16:15:38 +0000 Received: from fout-a4-smtp.messagingengine.com ([103.168.172.147]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tg4XN-003zcp-2L for pgsql-general@lists.postgresql.org; Thu, 06 Feb 2025 16:15:37 +0000 Received: from phl-compute-11.internal (phl-compute-11.phl.internal [10.202.2.51]) by mailfout.phl.internal (Postfix) with ESMTP id ABDF21380235; Thu, 6 Feb 2025 11:15:31 -0500 (EST) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-11.internal (MEProxy); Thu, 06 Feb 2025 11:15:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aklaver.com; h= cc:content-transfer-encoding:content-type:content-type:date:date :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1738858531; x=1738944931; bh=Pzvi5oiw7S7jcm7mrNBfoM85ZvuivIiu/VIpqpH7obM=; b= OsIcl0D5MC8Q9rrapOn+9eXBX/RAvREJfJX5DMFOGgveOMqkeNLTT78CM/MijZ3q mPM23W9CZzfTsvticC7y4TeN1dZTgCc80J5SClryFJwa7Y/P42YwcpRzoFmVeL8E oOtjGRw6l04bdzHUnDrXjlT1hSdY/pw9T1kCo10njH13d9gH8TOsgnM1+IZTrE1p A7vtNzAFB+aEuJjUQC+aKCy4XmzDJ4FbVLw8EEyRV6pYh63QGENb5UPDMg3a/KWU /ZmNzZbIsSepxIKBstObVU3U6yBLHcjylSOnDSChr89ivPBBxZpBxdXZUE5xJFLi FkLE0/lMd/RVE25TaNjwXg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm3; t=1738858531; x=1738944931; bh=P zvi5oiw7S7jcm7mrNBfoM85ZvuivIiu/VIpqpH7obM=; b=RvnbVrm0gOHGr5lpj Rw4dUhwxtzs3GR1M+Qr5kl0FWqJhQtQpV2LIBFjDUdpXQaXS3B1aOA9JkgygRbsR mU/5J9GYEoNEelIkCJGJYKI246HhJXpnqpAJiOLrbmor6sHbadCrvUdwZVIFnMtU L4HYDv8dpk2A5bGh6amxorNJv1w4qI6s2ajG6tm27VHpYw6vJyKj1JkrtMbcqJNs k7W/uk+07Fdrhkj6zftIx7nAgDZNbP8ZSh3hxbTaK7pUEcj4PnW0Rjwis59JARFo xU5od6hsPj0iU3kFANhJN+59gRjHZjjNouWqObAIGRfTSAzPm9ahQIMZPqJvKP29 fn/KA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefvddrtddtgddvieekfecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpggftfghnshhusghstghrihgsvgdp uffrtefokffrpgfnqfghnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivg hnthhsucdlqddutddtmdenucfjughrpefkffggfgfuvfhfhfgjtgfgsehtjeertddtvdej necuhfhrohhmpeetughrihgrnhcumfhlrghvvghruceorggurhhirghnrdhklhgrvhgvrh esrghklhgrvhgvrhdrtghomheqnecuggftrfgrthhtvghrnhepiedvhfeiheehgeeuieel jeeitedtjeehudegfeelkedvleekhedtgfeiffefkedunecuvehluhhsthgvrhfuihiivg eptdenucfrrghrrghmpehmrghilhhfrhhomheprggurhhirghnrdhklhgrvhgvrhesrghk lhgrvhgvrhdrtghomhdpnhgspghrtghpthhtohepvddpmhhouggvpehsmhhtphhouhhtpd hrtghpthhtoheprhhlvggrtghhsehprhhinhgtvghtohhnrdgvughupdhrtghpthhtohep phhgshhqlhdqghgvnhgvrhgrlheslhhishhtshdrphhoshhtghhrvghsqhhlrdhorhhg X-ME-Proxy: Feedback-ID: i76984098:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Feb 2025 11:15:30 -0500 (EST) Message-ID: <88d60ace-45e6-4d41-afc4-113df7219c4d@aklaver.com> Date: Thu, 6 Feb 2025 08:15:29 -0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: How to perform a long running dry run transaction without blocking To: Robert Leach , pgsql-general@lists.postgresql.org References: Content-Language: en-US From: Adrian Klaver In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 2/6/25 07:40, Robert Leach wrote: Comments inline. > Let me try and distill all the back story down to an oversimplified explanation: > > I created a scientific data submission validation interface that helps researchers compile their data submissions to our database. To do this, I decided to add a `--validate` option to the load scripts that essentially raises a dry run exception at the end of a load run so that the data is all rolled back before leaving the atomic transaction block. Load to where existing table or temporary table? > > This validation interface skips the raw data load step, which is the heftiest, most long running, part and generally finishes in well under a minute. So what is the validation actually doing? > > The interface works fantastically well. It rolls back problematic data in smaller transactions and buffers those errors for a final report that the user can work to fix in their data files and retry validation until all their problems are solved and the data is ready to load. > > The problem is that if we have a long running load going on in the background (which can take hours) and during that load, a user tries to validate a new submission that contains some overlapping common data (e.g. a new tissue type or protocol that is currently being loaded) or perhaps they are trying to validate data added to an existing submission that is being partially loaded, that validation process gets blocked and the validation interface encounters a gateway timeout. Where are the background processes loading data to? > > I had tried changing the isolation level to both repeatable read and serializable, but the hang can still occur (though serializable seems to avoid hangs in some cases that repeatable read does not). > > My initial interpretation of the isolation level documentation was that transactions would not be aware of what is happening in other transactions and that if there was a problem, you would only ever get a serialization error at the end when the data is committed, but after encountering the hang yesterday, I realized that the "snapshot" that the second transaction gets includes locks already established by the first transaction. > > I had hoped that since my validation interface would never commit, it would never have a serialization error or hang. > > What options are available that might allow this all to work as I intended? Is there some way to tell the validation process transaction to assume that any pending changes from another transaction will succeed? Not that I know of and that would be a false promise anyway as I know of no process that is perfect. > > I have lots of ideas on how to mitigate these occurrences, but I don't want to bias any potentially helpful responses. Am I screwed from my early decision to use the loading code to validate data submissions? > -- Adrian Klaver adrian.klaver@aklaver.com