Message-ID: <2568733f3acd9d6f65216acb486669ebbcdf7f6b.camel@cybertec.at>
Subject: Re: Vacuum Questions
From: Laurenz Albe <laurenz.albe@cybertec.at>
To: Leo <leo1969@gmail.com>, pgsql-novice@lists.postgresql.org
Date: Fri, 02 May 2025 18:37:35 +0200
In-Reply-To: 
 <CACWnfuZibskXj0gWEfx95jXpzhQWxmFRtK1Fc4fT5bUeY6AWsg@mail.gmail.com>
References: 
	<CACWnfuZibskXj0gWEfx95jXpzhQWxmFRtK1Fc4fT5bUeY6AWsg@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
User-Agent: Evolution 3.54.3 (3.54.3-1.fc41) 
MIME-Version: 1.0
Archived-At: 
 <https://www.postgresql.org/message-id/2568733f3acd9d6f65216acb486669ebbcdf7f6b.camel%40cybertec.at>
Precedence: bulk

On Fri, 2025-05-02 at 09:50 -0400, Leo wrote:
> I have been working on AWS PostgreSQL RDS for a few years, but still not =
very experienced
> when it comes to performance issues.=C2=A0 Plus RDS is slightly different=
 from the pure=C2=A0PostgreSQL.
>=20
> I am trying=C2=A0to comprehend=C2=A0exactly how vacuum works.
>=20
> Here is what I am trying to do.
>=20
> I am purging old records from a table (500 million rows, but I am doing i=
t in sets of
> 50,000,000 with a smaller loop of 100,000).=C2=A0 That works just fine.
>=20
> Because of the amount of data/rows deleted, I disabled=C2=A0the autovacuu=
m for this table
> (I want to have control over vacuum, autovacuum does not complete anyway =
due to the
> timeout, sizing, etc settings that I do not want to change system wide).=
=C2=A0 I will put
> the autovacuum back once I am done of course.
>=20
> The issue is when I start vacuuming.=C2=A0 This table has 4 indexes and a=
 PK that I worry about.
> The PK takes about 30 minutes to vacuum and two of the indexes take about=
 an hour each.
> The problem comes in for the other 2 indexes - they take 12+ hours each t=
o vacuum:
>=20
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 vacuuming "public.pc_=
workflowlog"
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 launched 4 parallel v=
acuum workers for index vacuuming (planned: 4)
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 scanned index "pc_wor=
kflowlog_pk" to remove 50000000 row versions
> DETAIL:=C2=A0 CPU: user: 191.03 s, system: 12.43 s, elapsed: 1711.22 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 scanned index "workfl=
owlo_u_publicid_g6uqp9lkn6e8" to remove 50000000 row versions
> DETAIL:=C2=A0 CPU: user: 325.75 s, system: 19.75 s, elapsed: 2674.24 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 scanned index "workfl=
owlo_n_workflow_2tc9k2hdtry9v" to remove 50000000 row versions
> DETAIL:=C2=A0 CPU: user: 312.17 s, system: 16.94 s, elapsed: 3097.88 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 scanned index "workfl=
owlo_n_frozenseti_2kjkbjgf3c6ro" to remove 50000000 row versions
> DETAIL:=C2=A0 CPU: user: 41187.70 s, system: 216.14 s, elapsed: 42749.36 =
s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 scanned index "workfl=
owlo_n_userid_14kqw6qdsnndw" to remove 50000000 row versions=C2=A0
> DETAIL:=C2=A0 CPU: user: 41280.66 s, system: 216.67 s, elapsed: 42832.16 =
s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:=C2=A0 "pc_workflowlog": rem=
oved 50000000 row versions in 1129870 pages
>=20
> I've increased=C2=A0max_parallel_maintenance_workers to 8 for the session=
 and it used parallel 4
> (one for each index I assume) to handle it and the two indexes were done =
in ~ an hour.=C2=A0 What
> I am trying to figure out is how to force the other two large indexes to =
be vacuumed in
> parallel - a few workers going against an index.=C2=A0 It seems it is pos=
sible to do, the index
> size is large enough to kick in, but I have not been able to figure it ou=
t yet.=C2=A0 Most of the
> parameters are at default values.
>=20
> What am I missing?

I didn't look at the code, but I believe that a single index won't be scann=
ed by more than a
single process.

> I have a few other questions.=C2=A0 Does vacuum=C2=A0time depend on the n=
umber of dead rows only and
> the size of the table, or does the entire storage allocation (including d=
ead tuples) also
> affect it?

I am not certain what you mean.
If you set "maintenance_work_mem" high enough, VACUUM may be able to get ri=
d of all dead
tuples in a single pass, which is very good for performance.  But that woul=
d figure under
"the time depends on the number of dead tuples".

> Would it be more beneficial to drop the two large indexes, purge, vacuum,=
 and recreate the
> indexes after make more sense (I know it needs to be tested)?

If you delete the majority of the rows, that may be a good move.
But then, maybe the best approach would be to create a new copy of the tabl=
e, copy all the
"surviving" rows there, create indexes on that copy, drop the original tabl=
e and replace it
with the copy.
That would avoid bloat, which is otherwise unavoidable if you delete a big =
part of the table.

> Lastly, is it better to delete all the rows (500 mil) instead of doing it=
 in smaller
> batches, and vacuum only once?

That will most likely be faster, and the bloat wouldn't be any worse.

Yours,
Laurenz Albe