Date: Wed, 25 Mar 2026 12:47:01 +0200
Subject: Re: Stack-based tracking of per-node WAL/buffer usage
From: Heikki Linnakangas
To: Lukas Fittl, Zsolt Parragi, Andres Freund
Cc: Tomas Vondra, PostgreSQL Hackers, Peter Smith
References: <06086cb4-881b-4f5a-96af-f275220ff52d@vondra.me>

On 24/03/2026 08:03,
Lukas Fittl wrote:
> Instead I've tried introducing a memory context for instrumentation
> managed as a resource owner, and I am now (for now) convinced that
> this is the right trade-off for the problem at hand.

Yes, that seems better.

This patch could use an overview README file; I'm struggling to
understand how this all works. Here's my understanding so far, please
correct me if I'm wrong:

There are *two* data structures tracking the Instrumentation nodes. The
patch only talks about a stack, but I think there's also implicitly a
tree in there.

Tree
----

All Instrumentation nodes are part of a tree. For example, if you have
two portals open, the tree might look like this:

Session
- Query A
  - NestLoop
    - Seq Scan A
    - Seq Scan B
- Query B
  - Seq Scan C

When a node is "finalized", its counters are added to its parent.

This tree is somewhat implicit in the patch. Each QueryInstrumentation
has a list of child nodes, but only unfinalized ones. Don't we need that
at the session level too? When a Query is released on abort, its
counters need to be added to the parent too. If I understand correctly,
the patch tries to use the stack for that, but it's confusing.

I think it would make the patch clearer to talk explicitly about the
tree, and to represent it explicitly in the Instrumentation nodes, i.e.
add a "parent" pointer, or a "children" list, or both to the
Instrumentation struct.

Stack
-----

At all times, there's a stack that tracks which Instrumentation node in
the tree is *currently* executing. For example, while executing the Seq
Scan B, the stack would look like this:

0: Session
1: Query A
2: NestLoop
3: Seq Scan B

And when the code is sending a result row back to the client, while the
query is being executed, the stack would be just:

0: Session

In the patch, the stack is represented by an array. It could also be
implemented with a CurrentInstrumentation global variable, similar to
CurrentMemoryContext and CurrentResourceOwner.

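To make the tree-plus-stack idea above concrete, here is a minimal
sketch. It is not the patch's code: InstrNode, CurrentInstr, the instr_*
functions and the single wal_bytes counter are all hypothetical names
chosen for illustration. It assumes an explicit "parent" pointer for the
tree and a single global for the currently executing node, with each
node remembering which node was current before it was entered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the Instrumentation struct. */
typedef struct InstrNode
{
    const char *name;
    struct InstrNode *parent;   /* explicit tree link */
    struct InstrNode *saved;    /* node that was current before we entered */
    uint64_t    wal_bytes;      /* one representative counter */
} InstrNode;

/* Plays the role of a CurrentInstrumentation global (assumed name). */
static InstrNode *CurrentInstr = NULL;

static void
instr_init(InstrNode *node, const char *name, InstrNode *parent)
{
    node->name = name;
    node->parent = parent;
    node->saved = NULL;
    node->wal_bytes = 0;
}

/* Start executing 'node': it becomes the top of the implicit stack. */
static void
instr_enter(InstrNode *node)
{
    node->saved = CurrentInstr;
    CurrentInstr = node;
}

/* Stop executing 'node' and restore whatever was current before it. */
static void
instr_leave(InstrNode *node)
{
    assert(CurrentInstr == node);
    CurrentInstr = node->saved;
    node->saved = NULL;
}

/* Charge usage to whichever node is currently executing, if any. */
static void
instr_count_wal(uint64_t bytes)
{
    if (CurrentInstr != NULL)
        CurrentInstr->wal_bytes += bytes;
}

/*
 * Finalize 'node': add its counters to its parent. This sketch does not
 * guard against finalizing the same node twice.
 */
static void
instr_finalize(InstrNode *node)
{
    if (node->parent != NULL)
        node->parent->wal_bytes += node->wal_bytes;
}
```

With this shape, entering Session, Query A, NestLoop and Seq Scan B in
turn reproduces the four-level stack from the example, and finalizing
Seq Scan B adds its counters to NestLoop through the parent pointer
without consulting the stack at all.
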
Abort handling
--------------

On abort, two things need to happen:

1. Reset the stack to the appropriate level. This ensures that we don't
later try to update the counters on an Instrumentation node that is
going away with the abort. In the above example, the stack would be
reset to the 0: Session level.

2. Finalize all the Instrumentation nodes as part of the ResourceOwner
cleanup. All Instrumentation nodes that are released roll up their
counters to their parents.

Questions:

Is the stack always a path from the root of the tree, down to some node?
Or could you have e.g. recursion like A -> B -> C -> A? (I don't know if
it makes a difference, just wondering)

What happens if you release e.g. the NestLoop before its children? All
the Instrumentation nodes belonging to a query would usually be part of
the same ResourceOwner, and there's no guarantee on what order the
resources are released.

- Heikki