Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tWGzU-00FRA4-W4 for pgsql-hackers@arkaria.postgresql.org; Fri, 10 Jan 2025 15:32:06 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tWGzU-00097d-BE for pgsql-hackers@arkaria.postgresql.org; Fri, 10 Jan 2025 15:32:04 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tWGzT-00097V-RP for pgsql-hackers@lists.postgresql.org; Fri, 10 Jan 2025 15:32:03 +0000 Received: from mail.postgrespro.ru ([93.174.131.139]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tWGzQ-000vZ0-1S for pgsql-hackers@postgresql.org; Fri, 10 Jan 2025 15:32:03 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=postgrespro.ru; s=mx2023; t=1736523119; bh=KiZQ4WLcyJu5OPJK444J8sREkzCPPXUKTnra+r0v3xQ=; h=Message-ID:Date:User-Agent:Subject:From:To:Cc:References: In-Reply-To:From; b=DNd3Bpyk5WM+rcaT0dk9mTN1xV8SSNmNMTLr9X6Vv0vt2XGUFRICqHK2zpWdNRpvs ZdntgFouvb0bDQvPfVVO9Yw3NrKJTxLpYJBEJ7FYUVlFE1IvsANbtyA9Q4LdZlDRg7 6eDVd0HWhCHnyUIgWukX9lEWYH7H5AtlMWU1LvZgWWPnA6gqQpkXHRDjoqDMLEgrlC RdnyfkaVf5UVSQNziIbTZcwbi7MCetB+N4HCPXyqTpW5e+CtyZzYWqAIDgVvFD3kK+ neaXW9bk4czfzX2oXDkBgSUVlK5/00E0/n7sRk1MoChTmb5WSi1LcauzFI5dlYfw8a E+/2fGHeiwTig== Received: from [10.4.12.74] (unknown [93.174.131.141]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: a.rybakina@postgrespro.ru) by mail.postgrespro.ru (Postfix/587) with ESMTPSA id AB33E6071B; Fri, 10 Jan 2025 18:31:59 +0300 (MSK) Content-Type: multipart/alternative; boundary="------------3wD000d43wUN1L5HTRFE0D3N" Message-ID: <0d218dcf-a06e-4876-a773-186d53002ea2@postgrespro.ru> Date: Fri, 10 Jan 2025 18:31:59 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Vacuum statistics From: Alena Rybakina To: Sami Imseih Cc: Ilia Evdokimov , Alexander Korotkov , Jim Nasby , Andrei Zubkov , Masahiko Sawada , Melanie Plageman , jian he , pgsql-hackers , a.lepikhov@postgrespro.ru References: <0B6CBF4C-CC2A-4200-9126-CE3A390D938B@upgrade.com> <6732acf8ce0f31025b535ae1a64568750924a887.camel@moonset.ru> <5AA8FFD5-6DE2-4A31-8E00-AE98F738F5D1@upgrade.com> <85b963fe-5977-43aa-9241-75b862abcc69@postgrespro.ru> <9C7A167C-DCDE-4A17-9ABE-6276723FEC50@upgrade.com> <2d493cf9-9ba7-4cc1-a3f2-67afd7c163ee@postgrespro.ru> <77e6e723-0d3e-4235-8386-03d143916125@postgrespro.ru> <8f933105-80b3-4eea-8aaf-902bbba45f79@postgrespro.ru> Content-Language: en-US In-Reply-To: <8f933105-80b3-4eea-8aaf-902bbba45f79@postgrespro.ru> X-KSMG-AntiPhishing: NotDetected, bases: 2025/01/10 13:01:00 X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 2.1.0.7854, bases: 2025/01/10 08:29:00 #26967733 X-KSMG-AntiVirus-Status: NotDetected, skipped X-KSMG-LinksScanning: not scanned, disabled by settings X-KSMG-Message-Action: skipped X-KSMG-Rule-ID: 1 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------3wD000d43wUN1L5HTRFE0D3N Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Sorry, I made a typo due to lack of sleep, I've marked below where exactly just in case. On 10.01.2025 15:04, Alena Rybakina wrote: > > Hi, I have updated the patch. Fix minor mistakes in the document, > added the wraparound_failsafe_count statistics - it accounts the > number of times when the vacuum operates heap relation to prevent > workaround problem, fixed "shemaname". > I didn't mean workaround problem but wraparound problem. >>> Secondly, where to put the total time of vacuum for indexes and databases? >>> It would be incorrect not to take them into account at all. What if we remove the total time from >>> the heap statistics and add it to pg_stat_tables and only leave the vacuum statistics total time of >>> vacuum operation of indexes and databases? >>> It seems strange to me that they will have to be viewed from different views. >>> >>> I think it is necessary to look at the total time for tables into perspective of how much >>> time vacuum spent in total on processing indexes, since indexes can be bloated, for example. >>> I think it is better to leave these statistics here. >> You make valid points. I now think because track_vacuum_statistics is >> optional, we should track total_time in 2 places. First place in the new >> view being proposed here and the second place is in pg_stat_all_tables >> as being proposed here [3]. This way if track_vacuum_statistics is off, the >> total_time of vacuum could still be tracked by pg_stat_all_tables. >> >> By the way, the current patch does not track materialized view, >> but it should as materialized views can also be vacuumed. >> >> Regards, >> >> Sami >> >> [1]https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE >> [2]https://github.com/postgres/postgres/blob/master/src/backend/access/heap/vacuumlazy.c#L2437-L2444 >> [3]https://commitfest.postgresql.org/52/5485/ >> >> > I don't agree with this. > > Firstly, the hook is enabled by default, that is, it must be specially > disabled so that the vacuum statistics are not collected. > > Secondly, it will cause confusion. First, the hook was disabled and > statistics were collected in one place - pg_stat_all_tables, and then > it was enabled and the user notices that the statistics there stopped > accumulating, > he is in a panic, "suddenly the vacuum does not work, what to do?". > The second point here bothers me, how to take into account this > statistics with the current detailed vacuum statistics? After all, > adding these values ​​​​is wrong - > they do not show the correct statistics regarding the same pages > processed by vacuum, ignoring it later means that they will be > redundant. I think it is better to save it here, since this will save > us from possible confusion. > > Secondly, it will immediately show other important parameters > regarding this statistics - how long the vacuum was sleep (delay_time > in my patches), how much time the vacuum spent on processing indexes > during its processing. > Without this information, this assessment will not be voluminous and > indicative enough. > I didn't mean hook but guc here. -- Regards, Alena Rybakina Postgres Professional --------------3wD000d43wUN1L5HTRFE0D3N Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit
Sorry, I made a typo due to lack of sleep, I've marked below where exactly just in case.

On 10.01.2025 15:04, Alena Rybakina wrote:

Hi, I have updated the patch. Fix minor mistakes in the document, added the wraparound_failsafe_count statistics  - it accounts the number of times when the vacuum operates heap relation to prevent workaround problem, fixed "shemaname".

I didn't mean workaround problem but wraparound problem.
Secondly, where to put the total time of vacuum for indexes and databases?
It would be incorrect not to take them into account at all. What if we remove the total time from
the heap statistics and add it to pg_stat_tables and only leave the vacuum statistics total time of
vacuum operation of indexes and databases?
It seems strange to me that they will have to be viewed from different views.

I think it is necessary to look at the total time for tables into perspective of how much
time vacuum spent in total on processing indexes, since indexes can be bloated, for example.
I think it is better to leave these statistics here.
You make valid points. I now think because track_vacuum_statistics is
optional, we should track total_time in 2 places. First place in the new
view being proposed here and the second place is in pg_stat_all_tables
as being proposed here [3]. This way if track_vacuum_statistics is off, the
total_time of vacuum could still be tracked by pg_stat_all_tables.

By the way, the current patch does not track materialized view,
but it should as materialized views can also be vacuumed.

Regards,

Sami

[1] https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE
[2] https://github.com/postgres/postgres/blob/master/src/backend/access/heap/vacuumlazy.c#L2437-L2444
[3] https://commitfest.postgresql.org/52/5485/


I don't agree with this. 

Firstly, the hook is enabled by default, that is, it must be specially disabled so that the vacuum statistics are not collected. 

Secondly, it will cause confusion. First, the hook was disabled and statistics were collected in one place - pg_stat_all_tables, and then it was enabled and the user notices that the statistics there stopped accumulating,
he is in a panic, "suddenly the vacuum does not work, what to do?". The second point here bothers me, how to take into account this statistics with the current detailed vacuum statistics? After all, adding these values ​​​​is wrong -
they do not show the correct statistics regarding the same pages processed by vacuum, ignoring it later means that they will be redundant. I think it is better to save it here, since this will save us from possible confusion. 

Secondly, it will immediately show other important parameters regarding this statistics - how long the vacuum was sleep (delay_time in my patches), how much time the vacuum spent on processing indexes during its processing.
Without this information, this assessment will not be voluminous and indicative enough.


    
I didn't mean hook but guc here.
-- 
Regards,
Alena Rybakina
Postgres Professional
--------------3wD000d43wUN1L5HTRFE0D3N--