MIME-Version: 1.0
From: R Wahyudi <rwahyudi@gmail.com>
Date: Wed, 17 Sep 2025 08:25:49 +1000
Message-ID: <CALWQLzRmzT7bo0c6CUX9=L_oLD3oUN8fZ5yyGLEwe7y5rWoxmQ@mail.gmail.com>
Subject: pg_restore scan
To: pgsql-general@lists.postgresql.org
Content-Type: multipart/alternative; boundary="000000000000de69c5063ef29ab9"
Archived-At: <https://www.postgresql.org/message-id/CALWQLzRmzT7bo0c6CUX9%3DL_oLD3oUN8fZ5yyGLEwe7y5rWoxmQ%40mail.gmail.com>
Precedence: bulk

--000000000000de69c5063ef29ab9
Content-Type: text/plain; charset="UTF-8"

I'm trying to troubleshoot a slow pg_restore and stumbled across a recent
post about pg_restore scanning the whole file:

> "scanning happens in a very inefficient way, with many seek calls and
> small block reads. Try strace to see them. This initial phase can take
> hours in a huge dump file, before even starting any actual restoration."

See:
https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net

I'm currently having this same issue.

Early in the restore I see a lot of disk write activity, but it tapers off
as time goes by.
I can see the COPY process in postgres, but it is not using any CPU; the
processes that use CPU are the pg_restore processes.

I can reproduce this issue when restoring a single table to stdout.

For example:
pg_restore -vvvv -t <some_table_at_the> DB.pgdump -f -

If the table is at the bottom of the TOC, it takes hours before I get a
result, but I get an almost immediate result when the table is at the top.
Parallel restore suffers from the same issue, where each process has to
perform a scan for each table.
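
In case it helps anyone reproduce this, here is a rough Python sketch for
checking where a table's data entry sits in the TOC listing printed by
pg_restore -l. It assumes the usual listing line shape
"<dump id>; <oid> <oid> TABLE DATA <schema> <name> <owner>" and treats
listing order only as a proxy for position in the file, which is my
assumption. The function names are made up for illustration.

```python
import subprocess


def toc_position(listing: str, table: str):
    """Return (index, total) for the TABLE DATA entry of `table` among the
    non-comment lines of a `pg_restore -l` listing, or None if absent."""
    # Header and comment lines in the listing start with ';'.
    entries = [ln for ln in listing.splitlines()
               if ln.strip() and not ln.lstrip().startswith(";")]
    for i, ln in enumerate(entries, start=1):
        # Drop the leading "<dump id>; " and split the remaining fields.
        _, _, rest = ln.partition("; ")
        fields = rest.split()
        # Assumed layout: oid, oid, "TABLE", "DATA", schema, name, owner.
        if (len(fields) >= 6 and fields[2:4] == ["TABLE", "DATA"]
                and fields[5] == table):
            return i, len(entries)
    return None


def list_archive(path: str) -> str:
    """Run `pg_restore -l` on an archive and return the TOC listing text."""
    result = subprocess.run(["pg_restore", "-l", path], check=True,
                            capture_output=True, text=True)
    return result.stdout
```

Something like toc_position(list_archive("DB.pgdump"), "my_table"), where
my_table is a placeholder name, would then show how far down the listing
the table's data entry appears.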

What is the best way to speed up the restore?


More info about my environment:
pg_restore (PostgreSQL) 17.6

Archive:
; Archive created at 2025-09-16 16:08:28 AEST
;     dbname: DB
;     TOC Entries: 8221
;     Compression: none
;     Dump Version: 1.14-0
;     Format: CUSTOM
;     Integer: 4 bytes
;     Offset: 8 bytes
;     Dumped from database version: 14.15
;     Dumped by pg_dump version: 14.19 (Ubuntu 14.19-1.pgdg22.04+1)

--000000000000de69c5063ef29ab9--