Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1shWw7-003SgH-EM for pgsql-general@arkaria.postgresql.org; Fri, 23 Aug 2024 16:14:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1shWw5-00EnPo-6a for pgsql-general@arkaria.postgresql.org; Fri, 23 Aug 2024 16:14:49 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1shWw4-00EnPa-LQ for pgsql-general@lists.postgresql.org; Fri, 23 Aug 2024 16:14:49 +0000 Received: from mail-oa1-x2a.google.com ([2001:4860:4864:20::2a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1shWw2-0015Mf-Fk for pgsql-general@lists.postgresql.org; Fri, 23 Aug 2024 16:14:47 +0000 Received: by mail-oa1-x2a.google.com with SMTP id 586e51a60fabf-27028b2cf22so1017435fac.1 for ; Fri, 23 Aug 2024 09:14:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724429685; x=1725034485; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=LeDTlSuobkZyTJtanQY00Lana2FXoiTR2vUnO02vwDg=; b=U+3skEPluLuVSybbJIjREgbnDuMZ4e9C+HAuieaMrJNRlZQ11EBks7MFkEINItlKnk CZTC5YgrtfXds8/t8spwxFyFwmCyxLlHdl6txPz2XvxN65PjQFwTZGDTfsI30N6p27if ShF5DVHPQJ1xvPb9JFSwh11Tog0O1QDgnU6Th1p8wU6CqIUG+T/PyZZUeWraeXAJFd6o 4LWpczHu77oSHRaTgzHUhPY3IHe5Jn81VtBG1Bw5apNldaFl563DBTO9X/nac4AbaWAB UA/g0A+hSkZxphPdgfHdB4/6F4U8aCs4UFiXgPFSoGvsOC/CLaw5j5OLa2C3s7zDqaFs 4YIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724429685; x=1725034485; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=LeDTlSuobkZyTJtanQY00Lana2FXoiTR2vUnO02vwDg=; b=LIYw2GwKAjHbI5IoFoz0ZwWVvtKAP1g0wEh5Log0uj/jwP5c7/fWnWDlGteStTJhER gifNbm2NN2Csjbf0AVOUMRDMIljc5FZD57TNFu9QGmmIvrCV6mCoRKsE4FSOOudxileD xelcjXBrJC3OYidJ+j6Y11knosdTIsZaZYxv5xeO/zbqJVjaSbe9B+zOiywI6+iD3b0K CT1SbM2O99D81YGWdhShABOBQ39ZRycm3Y8irD4G84oQpaNxeQEBethd5NOIJv0YOtMV bwqqkYtoKH7XLMRCsbkLvyxqyeqE9N+8MvPUu/6ysVl/KvjsFpqd9W83g8XF2P3dJjWD nZ8Q== X-Gm-Message-State: AOJu0YwAufb9k6BqTY6QHbkynAEIVTGeh+JN+xEb5kdkiFtVMt1oBoHw A0GoYT4kYyi8NKyKD+Cv3FmoOr12yZNS9j5Tx/S8KknXZzAgoWYwRxi6wau8PRGEtwCWHL7HrE7 dm0+pOPzqznqO1hrjFTsiYgGu1j0Ld6T+a78= X-Google-Smtp-Source: AGHT+IFOQRFeESGoUZrSZeOGfUhhjq1w5gE2iMIAam7XW753WCFpmXR0N5hs3deKo3J/ZFmkMLbJJl7v4iRXr/Uyezw= X-Received: by 2002:a05:6870:b01e:b0:260:28c0:669a with SMTP id 586e51a60fabf-273e4741790mr992290fac.25.1724429685280; Fri, 23 Aug 2024 09:14:45 -0700 (PDT) MIME-Version: 1.0 From: Matthew Tice Date: Fri, 23 Aug 2024 10:14:34 -0600 Message-ID: Subject: dead tuple difference between pgstattuple and pg_stat_user_tables To: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000d4d4f306205c123d" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000d4d4f306205c123d Content-Type: text/plain; charset="UTF-8" Hi All, I'm trying to understand why there's a difference between what pgstattuple reports and pg_stat_user_tables reports (for the number of dead tuples). As I understand, pgstattuple and pgstattuple_approx return the exact number of dead tuples (as noted in the documentation) and based on an older Stack Overflow answer the value returned from pg_stat_user_tables "uses the most recent data collected by ANALYZE". Why would it be that even after analyzing a table the n_dead_tup value is still vastly different than dead_tuple_count? > SELECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovacuum_c ount FROM pg_stat_user_tables WHERE relname = 'oban_jobs' and schemaname = 'oban')b; -[ RECORD 1 ]------------------------- dead_tuple_count | 3736 n_dead_tup | 1127044 last_autovacuum | 2024-08-23 16:00:30.983141+00 last_analyze | 2024-08-23 15:33:50.628422+00 now | 2024-08-23 16:01:19.915893+00 autovacuum_count | 446478 SELECT 1 > vacuum (verbose,analyze) oban.oban_jobs; vacuuming "oban.oban_jobs" table "oban_jobs": index scan bypassed: 29341 pages from table (0.79% of total) have 1111747 dead item identifiers launched 2 parallel vacuum workers for index cleanup (planned: 2) index "oban_jobs_args_index" now contains 18281 row versions in 10232 pages 0 index row versions were removed. 0 index pages were newly deleted. 56 index pages are currently deleted, of which 833 are currently reusable. CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. index "oban_jobs_meta_index" now contains 18281 row versions in 9698 pages 0 index row versions were removed. 0 index pages were newly deleted. 35 index pages are currently deleted, of which 621 are currently reusable. CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. table "oban_jobs": found 855 removable, 9661 nonremovable row versions in 29341 out of 3727204 pages 1330 dead row versions cannot be removed yet, oldest xmin: 1378705314 Skipped 0 pages due to buffer pins, 3696951 frozen pages. 912 skipped pages using mintxid fork. CPU: user: 0.12 s, system: 0.08 s, elapsed: 0.22 s. vacuuming "pg_toast.pg_toast_72454950" table "pg_toast_72454950": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages 0 dead row versions cannot be removed yet, oldest xmin: 1378705314 Skipped 0 pages due to buffer pins, 0 frozen pages. 0 skipped pages using mintxid fork. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. analyzing "oban.oban_jobs" "oban_jobs": scanned 30000 of 3727204 pages, containing 75 live rows and 10501 dead rows; 75 rows in sample, 9318 estimated total rows VACUUM > SELECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovacuum_c ount FROM pg_stat_user_tables WHERE relname = 'oban_jobs' and schemaname = 'oban')b; -[ RECORD 1 ]------------------------- dead_tuple_count | 1701 n_dead_tup | 1306009 last_autovacuum | 2024-08-23 16:01:31.034229+00 last_analyze | 2024-08-23 16:01:47.85574+00 now | 2024-08-23 16:01:55.734589+00 autovacuum_count | 446479 This is a Google Alloy DB instance running: > select version(); -[ RECORD 1 ]------------------------- version | PostgreSQL 14.10 on x86_64-pc-linux-gnu, compiled by Debian clang version 12.0.1, 64-bit SELECT 1 --000000000000d4d4f306205c123d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi All,

I'm trying to understand wh= y there's a difference between what pgstattuple reports and pg_stat_use= r_tables reports (for the number of dead tuples).

As I understand, p= gstattuple and pgstattuple_approx return the exact number of dead tuples (a= s noted in the documentation) and based on an older Stack Overflow answer t= he value returned from pg_stat_user_tables "uses the most recent data = collected by ANALYZE".

Why would it be that even after analyzin= g a table the n_dead_tup value is still vastly different than dead_tuple_co= unt?

> SELECT * FROM (SELECT dead_tuple_count from pgstattuple_ap= prox('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_a= nalyze,now(),autovacuum_c
=C2=A0ount FROM pg_stat_user_tables WHERE reln= ame =3D 'oban_jobs' and schemaname =3D 'oban')b;
-[ RECO= RD 1 ]-------------------------
dead_tuple_count | 3736
n_dead_tup = =C2=A0 =C2=A0 =C2=A0 | 1127044
last_autovacuum =C2=A0| 2024-08-23 16:00:= 30.983141+00
last_analyze =C2=A0 =C2=A0 | 2024-08-23 15:33:50.628422+00<= br>now =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| 2024-08-23 16:01:1= 9.915893+00
autovacuum_count | 446478
SELECT 1

> vacuum (ve= rbose,analyze) oban.oban_jobs;

vacuuming "oban.oban_jobs"<= br>table "oban_jobs": index scan bypassed: 29341 pages from table= (0.79% of total) have 1111747 dead item identifiers
launched 2 parallel= vacuum workers for index cleanup (planned: 2)
index "oban_jobs_arg= s_index" now contains 18281 row versions in 10232 pages
0 index row= versions were removed.
0 index pages were newly deleted.
56 index pa= ges are currently deleted, of which 833 are currently reusable.
CPU: use= r: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
index "oban_jobs_meta_i= ndex" now contains 18281 row versions in 9698 pages
0 index row ver= sions were removed.
0 index pages were newly deleted.
35 index pages = are currently deleted, of which 621 are currently reusable.
CPU: user: 0= .01 s, system: 0.00 s, elapsed: 0.01 s.
table "oban_jobs": fou= nd 855 removable, 9661 nonremovable row versions in 29341 out of 3727204 pa= ges
1330 dead row versions cannot be removed yet, oldest xmin: 137870531= 4
Skipped 0 pages due to buffer pins, 3696951 frozen pages.
912 skipp= ed pages using mintxid fork.
CPU: user: 0.12 s, system: 0.08 s, elapsed:= 0.22 s.
vacuuming "pg_toast.pg_toast_72454950"
table "= ;pg_toast_72454950": found 0 removable, 0 nonremovable row versions in= 0 out of 0 pages
0 dead row versions cannot be removed yet, oldest xmin= : 1378705314
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 sk= ipped pages using mintxid fork.
CPU: user: 0.00 s, system: 0.00 s, elaps= ed: 0.00 s.
analyzing "oban.oban_jobs"
"oban_jobs"= ;: scanned 30000 of 3727204 pages, containing 75 live rows and 10501 dead r= ows; 75 rows in sample, 9318 estimated total rows
VACUUM

> SEL= ECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_= jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovac= uum_c
=C2=A0ount FROM pg_stat_user_tables WHERE relname =3D 'oban_jo= bs' and schemaname =3D 'oban')b;
-[ RECORD 1 ]--------------= -----------
dead_tuple_count | 1701
n_dead_tup =C2=A0 =C2=A0 =C2=A0 |= 1306009
last_autovacuum =C2=A0| 2024-08-23 16:01:31.034229+00
last_a= nalyze =C2=A0 =C2=A0 | 2024-08-23 16:01:47.85574+00
now =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| 2024-08-23 16:01:55.734589+00
autova= cuum_count | 446479

This is a Google Alloy DB instance running:
&= gt; select version();
-[ RECORD 1 ]-------------------------
version = | PostgreSQL 14.10 on x86_64-pc-linux-gnu, compiled by Debian clang version= 12.0.1, 64-bit
SELECT 1
--000000000000d4d4f306205c123d--