Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQ03U-001CwU-1H for pgpool-hackers@arkaria.postgresql.org; Thu, 21 May 2026 09:51:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wQ03R-009nvb-0J for pgpool-hackers@arkaria.postgresql.org; Thu, 21 May 2026 09:51:02 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQ03Q-009nvH-2O for pgpool-hackers@lists.postgresql.org; Thu, 21 May 2026 09:51:01 +0000 Received: from meldrar.postgresql.org ([2a02:c0:301:0:ffff::31]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wQ03P-0000000050N-0DjX for pgpool-hackers@lists.postgresql.org; Thu, 21 May 2026 09:51:00 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=postgresql.org; s=20171124; h=Content-Transfer-Encoding:Content-Type: Mime-Version:References:In-Reply-To:From:Subject:Cc:To:Message-Id:Date:Sender :Reply-To:Content-ID:Content-Description; bh=fR9r7nWBrqlhoCFoynAHRSyQxJEpQUQrJfdGoKbAq3Q=; b=lkD2X0QbjFnaqzXMeuOms+qK/J cngDzD4H+isFQu20Umt4qGvhpo+9sZFholWVvMMcaWZkySSPA+knGGx2b9xMByXihNwYPeR0nJtx1 iKliwv8SYPu1djQRX9KwvbjNAKxiOfYOosMLswDOumFfEP8QNlncZ2zNvAt+7o3anjqGbQVX/j4jk QAsb6GPfTNH+WDX/WkRUS5+zZjlVA9HQYNaXeB5mzW9d13swwnRqoz6fxN9fzajfTAcN11B+4zGbX fPNo/6Gr2m44pw4Ng4OkKZMk8WjjHvp211G9StmrnPXj8eNMadPrq6zJcoVWNoC+14dPDpP/C68hW LlZDxvHw==; Received: from [2409:11:4120:300:68df:9b4f:848b:484c] (helo=localhost) by meldrar.postgresql.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQ03L-001aML-0U; Thu, 21 May 2026 09:50:57 +0000 Date: Thu, 21 May 2026 18:50:44 +0900 (JST) Message-Id: <20260521.185044.504532841722447976.ishii@postgresql.org> To: nadav@tailorbrands.com Cc: pgpool-hackers@lists.postgresql.org Subject: Re: Proposal: Recent mutated table tracking in memory From: Tatsuo Ishii In-Reply-To: References: <20260518.191105.282754674107940178.ishii@postgresql.org> <20260520.132810.1024114140609315185.ishii@postgresql.org> X-Mailer: Mew version 6.8 on Emacs 29.3 Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Host-Lookup-Failed: Reverse DNS lookup failed for 2409:11:4120:300:68df:9b4f:848b:484c (failed) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Nadav, > Hi Tatsuo, > > Thanks for checking the V3, sorry for missing the test issue. > > I reproduced the timeout locally. Found and fixed the root cause. > > Root cause > ---------- > > In CommandComplete.c, the autocommit write-tracking code was > gated only on session_context->is_in_transaction, not on the > cluster mode. I think you are talking about the logic to judge whether we are in an explicite transaction or not here. Current dml_adaptive checks supplied query is a transaction starting command like BEGIN. IMO this is fundamentaly wrong because the command may fail by various reasons. The correct way is checking transaction state by using TSTATE macro. Note that the macro can only be used at leat one ready for query response returned from backend (simple query protocol case), or command complete response is returned from backend (extended query protocol case). > In native replication and snapshot isolation modes, > dml_adaptive() is never called (it lives inside > where_to_send_main_replica), so is_in_transaction is never set > to true even inside an explicit BEGIN/COMMIT block. That meant > every DML in those modes was treated as autocommit by the > write-tracking code, triggering > pool_track_table_mutation_get_database_oid() ― which does a > relcache do_query ― while a transaction was actually in flight > on the backend connection. The do_query conflicts with the > in-flight transaction and hangs the session. Assuming "a transaction was actually in flight" means a transaction was open (explicit transaction), not really. do_query can be called inside or outside of an explicit transaction. Anyway, I found dml_adaptive is completely broken (it brings wrong results if query cache enabled). Unless there are users for the feature, maybe we should remove dml_adaptive entirely? Regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp