From: Tom Lane
To: Greg Sabino Mullane
cc: Adrian Klaver, pgsql-general@lists.postgresql.org
Subject: Re: Repeatable Read Isolation Level "transaction start time"
Date: Sat, 05 Oct 2024 15:09:39 -0400
Message-ID: <1067324.1728155379@sss.pgh.pa.us>

Greg Sabino Mullane writes:
> All we can
> guarantee via pg_stat_activity is that if xact_start and query_start *are* identical,
> no snapshot has been granted yet,

Surely that's not true either.  xact_start = query_start implies that
the current statement is the first in its transaction (assuming
sufficiently fine-grained clock timestamps, something I'm not sure is
an entirely safe assumption).  But if that statement is not simply
a BEGIN, it's likely obtained its own transaction snapshot after a
few microseconds.

As long as "read the system clock" is a distinct operation from
"read a snapshot", there are going to be skew issues here.  We
could maybe eliminate that by reading the clock while holding the
lock that prevents commits while reading a snapshot, but I doubt
that anybody is going to accept that on performance grounds.
Adding a not-guaranteed-cheap syscall inside that extremely hot
code path seems unsatisfactory.

Also, we currently do guarantee that xact_start matches query_start
for the first statement of the transaction (the converse of what
I said above).  Removing that guarantee in order to add some other
one wouldn't necessarily please everybody.

			regards, tom lane
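
The skew described above can be sketched with a toy model. This is not
PostgreSQL code: ToyServer, read_clock, commit and take_snapshot are
hypothetical names invented for illustration, and the integer "clock" only
stands in for the system clock. The sketch assumes nothing more than what the
message states, namely that reading the clock and taking a snapshot are two
separate steps, so another session can commit in between.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    clock: int = 0                                # stand-in for the system clock
    commits: list = field(default_factory=list)   # (xid, commit_time) pairs

    def read_clock(self) -> int:
        # Every clock read advances the toy clock by one tick.
        self.clock += 1
        return self.clock

    def commit(self, xid: int) -> None:
        # Record a commit stamped with the current toy time.
        self.commits.append((xid, self.read_clock()))

    def take_snapshot(self) -> frozenset:
        # A snapshot "sees" every transaction committed so far.
        return frozenset(xid for xid, _ in self.commits)


server = ToyServer()
server.commit(100)                   # committed before our transaction starts

xact_start = server.read_clock()     # step 1: timestamp the transaction
server.commit(101)                   # another session commits in the gap
snapshot = server.take_snapshot()    # step 2: snapshot taken slightly later

# Commits visible to the snapshot although stamped later than xact_start.
skewed = [xid for xid, t in server.commits if xid in snapshot and t > xact_start]
print(skewed)  # [101]
```

In this model the gap would close only if the clock were read inside
take_snapshot while commits are blocked, which is the option the message
doubts would be accepted on performance grounds.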