Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wAElO-002BxA-22 for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 22:19:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wAElM-002php-2s for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 22:19:13 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wAElL-002phh-22 for pgsql-hackers@lists.postgresql.org; Tue, 07 Apr 2026 22:19:13 +0000 Received: from fhigh-b2-smtp.messagingengine.com ([202.12.124.153]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wAElJ-000000016u2-2neA for pgsql-hackers@lists.postgresql.org; Tue, 07 Apr 2026 22:19:11 +0000 Received: from phl-compute-01.internal (phl-compute-01.internal [10.202.2.41]) by mailfhigh.stl.internal (Postfix) with ESMTP id 3311C7A013F; Tue, 7 Apr 2026 18:19:08 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-01.internal (MEProxy); Tue, 07 Apr 2026 18:19:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1775600347; x=1775686747; bh=BVjyJG67+m ueWwtVvQLCnSC5VOyj0CNXao/x+uuro1g=; b=gbANImVX32/hq2pf9+fz1n78A1 tfeb5sXDlCP489l7IlEAifRSTai0JNOmy0bzvhmbKgXHRzhrufsLaG801HVc3etq 7nAYbrDVNW2zzbKjKZF+t9se9Uimmv7cc6bZZjP4tLaA0RoFOy0fya2RJqnyc3Y+ H4daw2RFDtVf+BH9l9BcCKprI/fYjwg600xfnsBjO3BOW8Y2LPElUEGvbztCqkq9 0puUxNXSYDBP6CtlBQHtVLVVIcg/TtwToSpoRh7Qrvfvh1ZVoaO6Hmc5wxNmguHz +Ys3oRhOo2lmiwnRXio6XhnSMW2UU3H/aFUriF8dQVVmkTFF3ivBvomhpD5Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t= 1775600347; x=1775686747; bh=BVjyJG67+mueWwtVvQLCnSC5VOyj0CNXao/ x+uuro1g=; b=iyDsAc+sdBsrBMRnuK+/eXMJsEJFNOtYkbxGqMWMkfe7cRPjqKo TYr0g0RJHY+uHs/4/HEhqBI/o4z4gb6oCRo3L3DyDRecrqcKCiye1yBXPaXKk85d 1fGuVi7UO8yRP7zYAq9FBCVNnBcbuxUAtq+Dm8JatUspU4+Pj2tdeB819OVjNLtd o3Cpw/Dj+fKIxpshSzfHizSvd4Dz5IT1d1f7q9tUhbFdvpHXZDHr8+AxLyc/WRDg u5fIY/YdLIRFubQjfWg9JFqlarBxeQ/DPcw9aB982sWxcqWN9aXkZF0UeI5Y2xFK 7ibNjKKvs1FZDbgx2bDzwvRjRVKX55N2eDA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefhedrtddtgddvudekhecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpeffhffvvefukfhfgggtuggjsehmtdfsredttddvnecuhfhrohhmpeetnhgurhgvshcu hfhrvghunhguuceorghnughrvghssegrnhgrrhgriigvlhdruggvqeenucggtffrrghtth gvrhhnpeeffedujefggffhgeehudffudeijeegtdekjefhveeiveegkeegkeelleejfeek ieenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegrnh gurhgvshesrghnrghrrgiivghlrdguvgdpnhgspghrtghpthhtohepiedpmhhouggvpehs mhhtphhouhhtpdhrtghpthhtoheplhhukhgrshesfhhithhtlhdrtghomhdprhgtphhtth hopehsmhhithhhphgsvddvhedtsehgmhgrihhlrdgtohhmpdhrtghpthhtohephhhlihhn nhgrkhgrsehikhhirdhfihdprhgtphhtthhopehpghhsqhhlqdhhrggtkhgvrhhssehlih hsthhsrdhpohhsthhgrhgvshhqlhdrohhrghdprhgtphhtthhopeiishholhhtrdhprghr rhgrghhisehpvghrtghonhgrrdgtohhmpdhrtghpthhtohepthhomhgrshesvhhonhgurh grrdhmvg X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 7 Apr 2026 18:19:07 -0400 (EDT) Date: Tue, 7 Apr 2026 18:19:06 -0400 From: Andres Freund To: Lukas Fittl Cc: Heikki Linnakangas , PostgreSQL Hackers , Tomas Vondra , Peter Smith , Zsolt Parragi Subject: Re: Stack-based tracking of per-node WAL/buffer usage Message-ID: References: <57biou6l65r7gr4nunoe6lignz2x6m3w45gihoypaez4pc46di@txj3bakhj66l> <3xbje45m5knff52mye5dfnrjdnwv7it2bzmqac3jqe66fvop4a@xvhy6zx7n6sb> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="7u6mrb22y43wqryc" Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --7u6mrb22y43wqryc Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi, On 2026-04-07 13:30:11 -0700, Lukas Fittl wrote: > 0001 is the change to make queryDesc->totaltime be allocated by > ExecutorStart instead of plugins themselves, and adds a > queryDesc->totaltime_options to have plugins request which level of > summary instrumentation they need. This change is pretty simple, and > could still make sense to get into 19. Because of the earlier > Instrumentation refactoring that was pushed (thanks!) we're already > asking extensions allocating queryDesc->totaltime to modify their use > of InstrAlloc, so I think we might as well clean this up now. Hm. That's a fair argument. They indeed would have to again change next release It's not a complicated change and removes more lines than it adds. I guess one thing I'm not sure is whether the fields shouldn't be renamed at the same time: a) To prevent extensions from continuing to set it, most of them do not test against assert enabled builds. With a different name they would get a compiler error. b) "totaltime" and "totaltime_options" are pretty poor descriptors of tracking query level statistics. If everyone has to change anyway, this is a good occasion. 'query_instr[_options]'? Any opinions? > 0002 is just ExecProcNodeInstr moved to instrument.c, as Andres had > suggested previously. We still get some quick performance wins from > doing that (see end of email), and again, its a simple change, so > could be considered if someone has bandwidth remaining. I've added a > later patch that then does the more complex inlining and gets us the > full speed up. Here it needs a few more inlines to get the full performance, otherwise it doesn't inline all the helpers. I think on balance I didn't like the prototype in instrument.h, that's too widely included, and it might even cause some circularity issues. It seems better to do the somewhat ugly thing and have the prototype be in executor.h. > 0002 measurements (with current master and TSC clock source used for > timing, best of three): > > CREATE TABLE lotsarows(key int not null); > INSERT INTO lotsarows SELECT generate_series(1, 50000000); > VACUUM FREEZE lotsarows; With the somewhat more extreme benchmark I used in the rdtsc thread and the added inline mentioned above I see a bit bigger wins. See the attached explainbench.sql - it doesn't quite cover all the combinations, but I think it gives a good enough overview. c=1 pgbench -f ~/tmp/explainbench.sql -P5 -r -t 10 master: statement latencies in milliseconds and failures: 200.800 0 SELECT pg_prewarm('pgbench_accounts'); 0.098 0 PREPARE query AS SELECT * FROM pgbench_accounts OFFSET 5000000 LIMIT 1; 212.010 0 EXPLAIN (ANALYZE, BUFFERS OFF, WAL OFF, TIMING OFF) 268.648 0 EXPLAIN (ANALYZE, BUFFERS OFF, WAL OFF, TIMING ON) 232.421 0 EXPLAIN (ANALYZE, BUFFERS ON, WAL ON, TIMING OFF) 283.531 0 EXPLAIN (ANALYZE, BUFFERS ON, WAL ON, TIMING ON) 0.030 0 DEALLOCATE query; 0002: statement latencies in milliseconds and failures: 201.558 0 SELECT pg_prewarm('pgbench_accounts'); 0.103 0 PREPARE query AS SELECT * FROM pgbench_accounts OFFSET 5000000 LIMIT 1; 188.696 0 EXPLAIN (ANALYZE, BUFFERS OFF, WAL OFF, TIMING OFF) 244.479 0 EXPLAIN (ANALYZE, BUFFERS OFF, WAL OFF, TIMING ON) 223.773 0 EXPLAIN (ANALYZE, BUFFERS ON, WAL ON, TIMING OFF) 266.947 0 EXPLAIN (ANALYZE, BUFFERS ON, WAL ON, TIMING ON) 0.034 0 DEALLOCATE query; That's something like 4-12%. Pretty nice for a patch that just adds a few lines around and adds a few inlines. > At this point I'd say its safe to say that we should push out later > changes to PG20, because it needs another good look over, and I don't > think Andres or Heikki have the capacity for that today (but I really > appreciate all the effort put in by both of you!). Indeed. > @@ -334,6 +334,9 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags) > > if (auto_explain_enabled()) > { > + /* We're always interested in runtime */ > + queryDesc->totaltime_options |= INSTRUMENT_TIMER; > - queryDesc->totaltime = InstrAlloc(INSTRUMENT_ALL); Not that it's going to make a significant difference, but it is nice that this now would need to track less. Kinda wonder about having EXPLAIN (ANALYZE BUFFERS totals_only, WAL totals_only) ...; in plenty cases that'd be all one needs, at substantially lower cost. Greetings, Andres Freund --7u6mrb22y43wqryc Content-Type: application/sql Content-Disposition: attachment; filename="explainbench.sql" Content-Transfer-Encoding: quoted-printable SELECT pg_prewarm('pgbench_accounts');=0A=0APREPARE query AS SELECT * FROM = pgbench_accounts OFFSET 5000000 LIMIT 1;=0A=0AEXPLAIN (ANALYZE, BUFFERS OFF= , WAL OFF, TIMING OFF)=0AEXECUTE query;=0A=0AEXPLAIN (ANALYZE, BUFFERS OFF,= WAL OFF, TIMING ON)=0AEXECUTE query;=0A=0AEXPLAIN (ANALYZE, BUFFERS ON, WA= L ON, TIMING OFF)=0AEXECUTE query;=0A=0AEXPLAIN (ANALYZE, BUFFERS ON, WAL O= N, TIMING ON)=0AEXECUTE query;=0A=0ADEALLOCATE query;=0A --7u6mrb22y43wqryc Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="v17a-0001-instrumentation-Move-ExecProcNodeInstr-to-allow.patch"