Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1suete-008aZ9-GJ for pgsql-hackers@arkaria.postgresql.org; Sat, 28 Sep 2024 21:22:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1suetd-00DUdz-RK for pgsql-hackers@arkaria.postgresql.org; Sat, 28 Sep 2024 21:22:33 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1suetd-00DUdS-BG for pgsql-hackers@lists.postgresql.org; Sat, 28 Sep 2024 21:22:33 +0000 Received: from mail.postgrespro.ru ([93.174.131.139]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1suetZ-001Zy5-FY for pgsql-hackers@postgresql.org; Sat, 28 Sep 2024 21:22:32 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=postgrespro.ru; s=mx2023; t=1727558549; bh=rgVtiRsdMGXsCvpF7yWEYBgvDLKE5fMcDkHAPFeFA8g=; h=Message-ID:Date:User-Agent:Subject:To:Cc:References:From: In-Reply-To:From; b=CV8tcxrbxSLfmZF0ARp5/Yy9EKwhgX7rCOZ32y8hHt9SB8ee3vunfHozZFKgEQNXv HL7l4rAQzmMcLRiHcNRqljtnNY0ZbMuZOAYyyB4DTKq0fOFrG1P+kFtiM86ojfPrb4 DUIv7eKipOW/sMh7dqqCgsPs5ctI2ip/Pg3HMsvHKji4ItCry021C7l1iqAwvJrpbG kunAyctp3Qmwoaixxq/9VCAYt232+ujDaI5S9P4yFh9k/cp7IeOHUo+Njb6PflwaRg 49OmmoWVSMEA8c0r1HVNYgkqI/y/BAnpR/aZbs2h3CERlsi3jCCKMuJNYorAsfvFIN 7ZKkhwM2QBBWg== Received: from [192.168.76.77] (unknown [185.211.158.166]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: a.rybakina@postgrespro.ru) by mail.postgrespro.ru (Postfix/587) with ESMTPSA id BF82D6078D; Sun, 29 Sep 2024 00:22:28 +0300 (MSK) Content-Type: multipart/alternative; boundary="------------nmJ3YlsrBAh3r7t20V6HlTI5" Message-ID: <9485d892-fd04-4e3a-ac24-7dd767cb7333@postgrespro.ru> Date: Sun, 29 Sep 2024 00:22:28 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Vacuum statistics To: Masahiko Sawada , Melanie Plageman , Andrei Zubkov Cc: jian he , Alexander Korotkov , Ilia Evdokimov , Alena Rybakina , pgsql-hackers , a.lepikhov@postgrespro.ru References: <53c47c2d-72a5-44f2-900c-9973b2af1808@tantorlabs.com> <4a902cea-54fb-41b5-b208-b84731a5f577@postgrespro.ru> <092adec6-4eae-4bd4-bd0d-473a9df1282b@tantorlabs.com> <3deae1bd-ad84-4459-a26e-04c9136b84e9@postgrespro.ru> <9b10c6d3-52c4-4eef-b67c-c33442667729@postgrespro.ru> Content-Language: en-US From: Alena Rybakina In-Reply-To: X-KSMG-AntiPhishing: NotDetected X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 2.1.0.7854, bases: 2024/09/28 19:29:00 #26681244 X-KSMG-AntiVirus-Status: NotDetected, skipped X-KSMG-LinksScanning: not scanned, disabled by settings X-KSMG-Message-Action: skipped X-KSMG-Rule-ID: 1 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------nmJ3YlsrBAh3r7t20V6HlTI5 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Hi! Thank you for your interesting for this patch! >> I took a very brief look at this and was wondering if it was worth >> having a way to make the per-table vacuum statistics opt-in (like a >> table storage parameter) in order to decrease the shared memory >> footprint of storing the stats. > I'm not sure how users can select tables that enable vacuum statistics > as I think they basically want to have statistics for all tables, but > I see your point. Since the size of PgStat_TableCounts approximately > tripled by this patch (112 bytes to 320 bytes), it might be worth > considering ways to reduce the number of entries or reducing the size > of vacuum statistics. The main purpose of these statistics is to see abnormal behavior of vacuum in relation to a table or the database as a whole. For example, there may be a situation where vacuum has started to run more often and spends a lot of resources on processing a certain index, but the size of the index does not change significantly. Moreover, the table in which this index is located can be much smaller in size. This may be because the index is bloated and needs to be reindexed. This is exactly what vacuum statistics can show - we will see that compared to other objects, vacuum processed more blocks and spent more time on this index. Perhaps the vacuum parameters for the index should be set more aggressively to avoid this in the future. I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may not enable the parameter for the object in time, because he will forget about it. As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem necessary. -- Regards, Alena Rybakina Postgres Professional --------------nmJ3YlsrBAh3r7t20V6HlTI5 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit Hi! Thank you for your interesting for this patch!
I took a very brief look at this and was wondering if it was worth
having a way to make the per-table vacuum statistics opt-in (like a
table storage parameter) in order to decrease the shared memory
footprint of storing the stats.
I'm not sure how users can select tables that enable vacuum statistics
as I think they basically want to have statistics for all tables, but
I see your point. Since the size of PgStat_TableCounts approximately
tripled by this patch (112 bytes to 320 bytes), it might be worth
considering ways to reduce the number of entries or reducing the size
of vacuum statistics.

The main purpose of these statistics is to see abnormal behavior of vacuum in relation to a table or the database as a whole.

For example, there may be a situation where vacuum has started to run more often and spends a lot of resources on processing a certain index, but the size of the index does not change significantly. Moreover, the table in which this index is located can be much smaller in size. This may be because the index is bloated and needs to be reindexed. 

This is exactly what vacuum statistics can show - we will see that compared to other objects, vacuum processed more blocks and spent more time on this index.

Perhaps the vacuum parameters for the index should be set more aggressively to avoid this in the future.

I suppose that if we turn off statistics collection for a certain object, we can miss it. In addition, the user may not enable the parameter for the object in time, because he will forget about it. 

As for the second option, now I cannot say which statistics can be removed, to be honest. So far, they all seem necessary.

-- 
Regards,
Alena Rybakina
Postgres Professional
--------------nmJ3YlsrBAh3r7t20V6HlTI5--