Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sRXWQ-00GFcw-7N for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jul 2024 13:38:14 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sRXWO-005KLs-Ud for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jul 2024 13:38:12 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sRXWO-005KLj-LG for pgsql-hackers@lists.postgresql.org; Wed, 10 Jul 2024 13:38:12 +0000 Received: from mail-lj1-x22e.google.com ([2a00:1450:4864:20::22e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sRXWL-001Ogs-Tz for pgsql-hackers@lists.postgresql.org; Wed, 10 Jul 2024 13:38:12 +0000 Received: by mail-lj1-x22e.google.com with SMTP id 38308e7fff4ca-2eaafda3b5cso61974831fa.3 for ; Wed, 10 Jul 2024 06:38:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720618689; x=1721223489; darn=lists.postgresql.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=wctOc/9+wm3kGhOcJcKCVa4efH87YgB6Y532yCIw1vU=; b=UnmuZuF9Cy4wPYdzKkkm/+HgCuq3DnZ+N6aA2GQaWgDl3n6CR98uoFREbDr4mi7ug5 Jv3cfD6HmVvAp/nx9Xa5Pjb6GE+iwv0MAy4TX/v/cJ+7rrZZCYDk5OJIRZoGfwfBsD24 +/lgAhMd7/Pme14tdDC5R68wQNhIhqql8qqHx/UsYHCnf6ZAdX76rFHp4qHEFz0+jzdm FhR5UE8AxlkFXdQ1Z6fWbyR6gVd5ysPn7wOOK72Ux4Dc/qmllIUjXLSI4AK8Wvc7Dja5 cNVuR5siXRAzPfKPLXJHeYMYEqq2mk4HPap5MBLQEVr7poP1ZP1ObVRgWr8i3RA1dSG/ fxZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720618689; x=1721223489; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=wctOc/9+wm3kGhOcJcKCVa4efH87YgB6Y532yCIw1vU=; b=JbQ7q2XG8jGLN948DDP+rBETOYwMK7YPFWZxjhEzYp7wn7h9feH9+eGbrdS5xZQWh7 74b7ofl99IESvfNn/Ekteef6fK5vJMIg1/ZgOWJ3sFi6QTpAdP2V9BRGaih9mh933MBS M7E8kAycTCcv6VeD5V3pn7zOmPHQIfT+555tLUZrpUYPb+cO2CZoJyBkUWl+rql5YEVT mZu7gsrO3REBRhy8w9lFtY0FjkHtvJuD8ytSIY+QPOyc3BNP8Fhr6ecm6SSef36qPf2b LoyMRQHSjpREDcuKtoqAw2nFqDpIJJC/3KaIYnxE+vbEiMQUwZoxpkhOCGO3STTKHE5U SdwA== X-Gm-Message-State: AOJu0YxFQXkt1Lm14I6AlNwcPRK3hcrIg22WKzjBSPRPAKtJ9lxCr+F5 JGgkVT/yzHcUjlvntc9QAjW7CZNCq1SMcx0e7Yt4kLRtXVs0Y/tQo1sHXw== X-Google-Smtp-Source: AGHT+IEk21+QM9CQ4XeSTzsRWxqQyAYhmJn9fWBgSWoeroCDTfUKz7SGNh21Yoi/q/IWmNdTG5cbQg== X-Received: by 2002:a2e:82d0:0:b0:2ee:493b:a4ee with SMTP id 38308e7fff4ca-2eeb30fcb53mr35246501fa.24.1720618688266; Wed, 10 Jul 2024 06:38:08 -0700 (PDT) Received: from ip-10-97-1-34.eu-west-3.compute.internal (ec2-15-236-134-5.eu-west-3.compute.amazonaws.com. [15.236.134.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-367cdfa0694sm5359945f8f.66.2024.07.10.06.38.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Jul 2024 06:38:07 -0700 (PDT) Date: Wed, 10 Jul 2024 13:38:06 +0000 From: Bertrand Drouvot To: Michael Paquier Cc: pgsql-hackers@lists.postgresql.org Subject: Re: relfilenode statistics Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On Wed, Jul 10, 2024 at 03:02:34PM +0900, Michael Paquier wrote: > On Sat, May 25, 2024 at 07:52:02AM +0000, Bertrand Drouvot wrote: > > But I think that it is in a state that can be used to discuss the approach it > > is implementing (so that we can agree or not on it) before moving > > forward. > > I have read through the patch to get an idea of how things are done, Thanks! > and I am troubled by the approach taken (mentioned down by you), but > that's invasive compared to how pgstats wants to be transparent with > its stats kinds. > > + Oid objoid; /* object ID, either table or function > or tablespace. */ > + RelFileNumber relfile; /* relfilenumber for RelFileLocator. */ > } PgStat_HashKey; > > This adds a relfilenode component to the central hash key used for the > dshash of pgstats, which is something most stats types don't care > about. That's right but that's an existing behavior without the patch as: PGSTAT_KIND_DATABASE does not care care about the objoid PGSTAT_KIND_REPLSLOT does not care care about the dboid PGSTAT_KIND_SUBSCRIPTION does not care care about the dboid That's 3 kinds out of the 5 non fixed stats kind. Not saying it's good, just saying that's an existing behavior. > That looks like the incorrect thing to do to me, particularly > seeing a couple of lines down that a stats kind is assigned so the > HashKey uniqueness is ensured by the KindInfo: > + [PGSTAT_KIND_RELFILENODE] = { > + .name = "relfilenode", You mean, just rely on kind, dboid and relfile to ensure uniqueness? I'm not sure that would work as there is this comment in relfilelocator.h: " * Notice that relNumber is only unique within a database in a particular * tablespace. " So, I think it makes sense to link the hashkey to all the RelFileLocator fields, means: dboid (linked to RelFileLocator's dbOid) objoid (linked to RelFileLocator's spcOid) relfile (linked to RelFileLocator's relNumber) > FWIW, I have on my stack of patches something to switch the objoid to > 8 bytes, actually, which is something that would be required for > pg_stat_statements as query IDs are wider than that and affect all > databases, FWIW. Relfilenodes are 4 bytes, okay still Robert has > proposed a couple of years ago a patch set to bump that to 56 bits, > change reverted in a448e49bcbe4. Right, but it really looks like this extra field is needed to ensure uniqueness (see above). > What you would be looking instead is to use the relfilenode as an > objoid Not sure that works, as it looks like uniqueness won't be ensured (see above). > and keep track of the OID of the original relation in each > PgStat_StatRelFileNodeEntry so as it is possible to know where a past > relfilenode was used? That makes looking back at the past relation's > elfilenodes stats more complicated as it would be necessary to keep a > list of the past relfilenodes for a relation, as well. Perhaps with > some kind of cache that maintains a mapping between the relation and > its relfilenode history? Yeah, I also thought about keeping a list of "previous" relfilenodes stats for a relation but that would lead to: 1. Keep previous relfilnode stats 2. A more complicated way to look at relation stats (as you said) 3. Extra memory usage I think the only reason "previous" relfilenode stats are needed is to provide accurate stats for the relation. Outside of this need, I don't think we would want to retrieve "individual" previous relfilenode stats in the past. That's why the POC patch "simply" copies the stats to the relation during a rewrite (before getting rid of the "previous" relfilenode stats). What do you think? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com