Re: pg_restore scan

public inbox for [email protected]  
13+ messages / 3 participants

* Re: pg_restore scan
@ 2025-09-16 22:36 Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: Adrian Klaver @ 2025-09-16 22:36 UTC (permalink / raw)
  To: R Wahyudi <[email protected]>; [email protected]

On 9/16/25 15:25, R Wahyudi wrote:
> 
> I'm trying to troubleshoot the slowness issue with pg_restore and 
> stumbled across a recent post about pg_restore scanning the whole file :
> 
>  > "scanning happens in a very inefficient way, with many seek calls and 
> small block reads. Try strace to see them. This initial phase can take 
> hours in a huge dump file, before even starting any actual restoration."
> see: https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net

This was for pg_dump output that was streamed to a Borg archive and as a
result had no object offsets in the TOC.

How are you doing your pg_dump?
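
The strace suggestion in the quoted post can be sketched roughly as below. This is only a sketch, assuming Linux with strace installed. The name trace_restore is hypothetical, and the STRACE and PG_RESTORE variables are override hooks added here so the command line can be dry-run; neither is part of PostgreSQL or strace.

```shell
# Sketch only: trace_restore is a hypothetical helper, not a PostgreSQL tool.
trace_restore() {
  dump=$1
  db=$2
  # -f follows child processes, -c prints a per-syscall count summary on exit,
  # -e trace=... limits tracing to the seek and read calls the quoted post mentions.
  "${STRACE:-strace}" -f -c -e trace=lseek,read \
    "${PG_RESTORE:-pg_restore}" -d "$db" "$dump"
}
# Usage (this runs a real restore): trace_restore /path/to/db.dump targetdb
```

A very large lseek/read count before any data is loaded would be consistent with the scanning behaviour described in the quoted post, though the numbers alone do not prove that missing offsets are the cause.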



-- 
Adrian Klaver
[email protected]







* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
@ 2025-09-17 00:54 ` R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:25   ` Re: pg_restore scan Adrian Klaver <[email protected]>
  0 siblings, 2 replies; 13+ messages in thread

From: R Wahyudi @ 2025-09-17 00:54 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: [email protected]

pg_dump was done using the following command:
pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>


* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-17 01:02   ` Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: Ron Johnson @ 2025-09-17 01:02 UTC (permalink / raw)
  To: pgsql-generallists.postgresql.org <[email protected]>

So, piping or redirecting to a file?  If so, then that's the problem.

pg_dump directly to a file puts file offsets in the TOC.

This is how I do custom dumps:
cd $BackupDir
pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump  2> ${db}.log
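
The point about writing straight to a file can be sketched as a small guard. This is a hedged illustration, not anyone's actual backup script: dump_to_file is a hypothetical name, and PG_DUMP is an override hook added here only so the command can be dry-run. It assumes that what matters is a seekable output, which a FIFO or pipe is not.

```shell
# Sketch only: dump_to_file is a hypothetical helper, not part of PostgreSQL.
# PG_DUMP can be overridden (e.g. for a dry run); it defaults to pg_dump.
dump_to_file() {
  db=$1
  target=$2
  # A FIFO or device cannot be seeked, so pg_dump could not go back and
  # fill in the data offsets in the TOC.
  if [ -e "$target" ] && [ ! -f "$target" ]; then
    echo "refusing: $target is not a regular file" >&2
    return 1
  fi
  # -f lets pg_dump open the output file itself instead of writing to a pipe.
  "${PG_DUMP:-pg_dump}" -Fc -d "$db" -f "$target"
}
# Usage: dump_to_file mydb /backups/mydb.dump
```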


-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!



* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
@ 2025-09-17 02:50     ` R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: R Wahyudi @ 2025-09-17 02:50 UTC (permalink / raw)
  To: Ron Johnson <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>

Sorry for not including the full command. Yes, it's piping to a
compression command:
 | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>


I think we found the issue! I'll do further testing and see how it goes!






* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-17 03:47       ` Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: Ron Johnson @ 2025-09-17 03:47 UTC (permalink / raw)
  To: pgsql-generallists.postgresql.org <[email protected]>

PG 17 has integrated zstd compression, while --format=directory lets you do
multi-threaded dumps.  That's much faster than a single-threaded pg_dump
into a multi-threaded compression program.

(If for _Reasons_ you require a single-file backup, then tar the directory
of compressed files using the --remove-files option.)
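
A minimal sketch of that approach follows. It assumes a pg_dump built with zstd support and GNU tar (for --remove-files). The function name dump_dir_and_tar and the PG_DUMP and TAR override variables are hypothetical, added only for illustration and dry runs.

```shell
# Sketch only: dump_dir_and_tar is a hypothetical helper name.
dump_dir_and_tar() {
  db=$1
  dir=$2
  jobs=$3
  # Directory format (-Fd) is what allows a parallel dump with -j.
  "${PG_DUMP:-pg_dump}" -Fd -j "$jobs" --compress=zstd -d "$db" -f "$dir" &&
    # Bundle the already-compressed files into one archive and delete the
    # originals as they are added (GNU tar option).
    "${TAR:-tar}" -cf "$dir.tar" --remove-files "$dir"
}
# Usage: dump_dir_and_tar mydb /backups/mydb.dir 8
```

Note that pg_restore would need the tar file unpacked back into a directory before it could read this dump, since the tar here only wraps the directory-format output.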


* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
@ 2025-09-18 12:58         ` R Wahyudi <[email protected]>
  2025-09-18 14:09           ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 15:54           ` Re: pg_restore scan Adrian Klaver <[email protected]>
  0 siblings, 2 replies; 13+ messages in thread

From: R Wahyudi @ 2025-09-18 12:58 UTC (permalink / raw)
  To: Ron Johnson <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>

Hi All,

Thanks for the quick and accurate response!  I have never been so happy to see
IOwait on my system!

I might be blind, as I can't find information about 'offset' in the pg_dump
documentation.
Where can I find more info about this?

Regards,
Rianto


* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-18 14:09           ` Ron Johnson <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: Ron Johnson @ 2025-09-18 14:09 UTC (permalink / raw)
  To: R Wahyudi <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>

It's towards the end of this long mailing list thread from a couple of
weeks ago.

https://www.postgrespro.com/list/id/[email protected]


* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-18 15:54           ` Adrian Klaver <[email protected]>
  2025-09-18 21:36             ` Re: pg_restore scan R Wahyudi <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: Adrian Klaver @ 2025-09-18 15:54 UTC (permalink / raw)
  To: R Wahyudi <[email protected]>; Ron Johnson <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>

On 9/18/25 05:58, R Wahyudi wrote:
> Hi All,
> 
> Thanks for the quick and accurate response!  I never been so happy 
> seeing IOwait on my system!

Because?

What did you find?

> 
> I might be blind as  I can't find information about 'offset' in pg_dump 
> documentation.
> Where can I find more info about this?

It is not in the user documentation.

From the thread Ron referred to, there is an explanation here:

https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us

I believe the actual code, for the -Fc format, is in pg_backup_custom.c 
here:

https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723

Per comment at line 755:

"
  If possible, re-write the TOC in order to update the data offset 
information.  This is not essential, as pg_restore can cope in most
cases without it; but it can make pg_restore significantly faster
in some situations (especially parallel restore).  We can skip this
step if we're not dumping any data; there are no offsets to update
in that case.
"


* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-18 15:54           ` Re: pg_restore scan Adrian Klaver <[email protected]>
@ 2025-09-18 21:36             ` R Wahyudi <[email protected]>
  2025-09-18 21:45               ` Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-19 04:06               ` Re: pg_restore scan Ron Johnson <[email protected]>
  0 siblings, 2 replies; 13+ messages in thread

From: R Wahyudi @ 2025-09-18 21:36 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

I am given a database dump file daily and have been asked to restore it.
I tried everything I could to speed up the process, including using -j 40.

I discovered that at the later stage of the restore process, the
following behaviour repeated a few times:
40 x pg_restore processes at 100% CPU
40 x postgres processes doing COPY but using 0% CPU
..... and zero disk write activity

I don't see this behaviour when restoring a database that was dumped with
-Fd.
Also, with an un-piped backup file, I can restore a specific table without
having to wait for hours.
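
Restoring a single table, as mentioned above, can be sketched like this. The helper name restore_one_table and the PG_RESTORE override variable are hypothetical. Note that, as far as I know, -t restores only the named table and does not pull in other objects it depends on.

```shell
# Sketch only: restore_one_table is a hypothetical helper name.
restore_one_table() {
  dump=$1
  db=$2
  table=$3
  # -t selects one table from the archive; -d names the target database.
  "${PG_RESTORE:-pg_restore}" -d "$db" -t "$table" "$dump"
}
# Usage: restore_one_table /backups/mydb.dump targetdb orders
```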



* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-18 15:54           ` Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-18 21:36             ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-18 21:45               ` Adrian Klaver <[email protected]>
  2025-09-18 23:45                 ` Re: pg_restore scan R Wahyudi <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: Adrian Klaver @ 2025-09-18 21:45 UTC (permalink / raw)
  To: R Wahyudi <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>



On 9/18/25 2:36 PM, R Wahyudi wrote:
> I've been given a database dump file daily and I've been asked to 
> restore it.
> I tried everything I could to speed up the process, including using -j 40.
> 
> I discovered that at the later stage of the restore process,  the 
> following behaviour repeated a few times :
> 40 x pg_restore process doing 100% CPU
> 40 x  postgres process doing COPY but using 0% CPU
> ..... and zero disk write activity
> 
> I don't see this behaviour when restoring the database that was dumped 
> with -Fd.
> Also with an un-piped backup file, I can restore a specific table 
> without having to wait for hours.

From the docs:

https://www.postgresql.org/docs/current/app-pgrestore.html

"
-j number-of-jobs

Only the custom and directory archive formats are supported with this 
option. The input must be a regular file or directory (not, for example, 
a pipe or standard input). Also, multiple jobs cannot be used together 
with the option --single-transaction.
"
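
A minimal sketch of a pre-flight check for that restriction, assuming a POSIX shell. The function name can_parallel_restore and the paths in the usage comment are hypothetical and not part of pg_restore:

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the archive is something that
# pg_restore -j accepts per the docs quoted above, i.e. a regular file
# (custom format) or a directory (directory format).
can_parallel_restore() {
    archive=$1
    # -f: regular file, -d: directory. FIFOs, pipes and missing paths fail both.
    [ -f "$archive" ] || [ -d "$archive" ]
}

# Usage (paths and job count are placeholders):
#   if can_parallel_restore /backups/mydb.dump; then
#       pg_restore -j 8 -d mydb /backups/mydb.dump
#   else
#       pg_restore -d mydb /backups/mydb.dump
#   fi
```

This only checks the input type. It says nothing about whether the TOC of the dump contains data offsets, which depends on how pg_dump wrote the file.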



-- 
Adrian Klaver
[email protected]








* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-18 15:54           ` Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-18 21:36             ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-18 21:45               ` Re: pg_restore scan Adrian Klaver <[email protected]>
@ 2025-09-18 23:45                 ` R Wahyudi <[email protected]>
  0 siblings, 0 replies; 13+ messages in thread

From: R Wahyudi @ 2025-09-18 23:45 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

>> The input must be a regular file or directory (not, for example, a pipe
>> or standard input).

Thanks again for the pointer!

I successfully ran a parallel restore with no warnings.
I hadn't really paid attention to how the dump was taken until I
stumbled upon your post.
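
For the archive, the fix discussed up-thread amounts to letting pg_dump write the file itself rather than piping its output into a compressor. A minimal sketch, assuming a POSIX shell; the function names, the PG_DUMP / PG_RESTORE override variables, the database names, paths and job count are all placeholders of mine, not something from this thread:

```shell
#!/bin/sh
# Sketch only: database names, paths and the job count are placeholders.
# PG_DUMP / PG_RESTORE are hypothetical override variables so that the
# binaries can be swapped (for example with "echo" for a dry run).
PG_DUMP=${PG_DUMP:-pg_dump}
PG_RESTORE=${PG_RESTORE:-pg_restore}

# Let pg_dump open the output file itself (-f) instead of writing to a
# pipe, so that it can seek back and record data offsets in the TOC.
dump_to_file() {
    db=$1
    archive=$2
    "$PG_DUMP" -Fc -d "$db" -f "$archive"
}

# Restore from the regular file with several jobs.
restore_parallel() {
    db=$1
    archive=$2
    jobs=$3
    "$PG_RESTORE" -j "$jobs" -d "$db" "$archive"
}

# Usage:
#   dump_to_file mydb /backups/mydb.dump
#   restore_parallel mydb_copy /backups/mydb.dump 8
```

The custom format is compressed by default, so an external compressor is not needed just to get a compressed single-file archive.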


Regards,
Rianto







* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 01:02   ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-17 02:50     ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-17 03:47       ` Re: pg_restore scan Ron Johnson <[email protected]>
  2025-09-18 12:58         ` Re: pg_restore scan R Wahyudi <[email protected]>
  2025-09-18 15:54           ` Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-18 21:36             ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-19 04:06               ` Ron Johnson <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: Ron Johnson @ 2025-09-19 04:06 UTC (permalink / raw)
  To: pgsql-general

On Thu, Sep 18, 2025 at 5:37 PM R Wahyudi <[email protected]> wrote:

> I've been given a database dump file daily and I've been asked to restore
> it.
> I tried everything I could to speed up the process, including using -j 40.
>
> I discovered that at the later stage of the restore process,  the
> following behaviour repeated a few times :
> 40 x pg_restore process doing 100% CPU
>

Threads are not magic.  IO and memory limitations still exist.


> 40 x  postgres process doing COPY but using 0% CPU
> ..... and zero disk write activity
>
> I don't see this behaviour when restoring the database that was dumped
> with -Fd.
> Also with an un-piped backup file, I can restore a specific table without
> having to wait for hours.
>

We explained this three days ago.  Heck, it's in this very email.   Click
on "the three dots", scroll down a bit.


> On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <[email protected]>
> wrote:
>
>> On 9/18/25 05:58, R Wahyudi wrote:
>> > Hi All,
>> >
>> > Thanks for the quick and accurate response!  I never been so happy
>> > seeing IOwait on my system!
>>
>> Because?
>>
>> What did you find?
>>
>> >
>> > I might be blind as  I can't find information about 'offset' in pg_dump
>> > documentation.
>> > Where can I find more info about this?
>>
>> It is not in the user documentation.
>>
>>  From the thread Ron referred to, there is an explanation here:
>>
>> https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us
>>
>> I believe the actual code, for the -Fc format, is in pg_backup_custom.c
>> here:
>>
>>
>> https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723
>>
>> Per comment at line 755:
>>
>> "
>>   If possible, re-write the TOC in order to update the data offset
>> information.  This is not essential, as pg_restore can cope in most
>> cases without it; but it can make pg_restore significantly faster
>> in some situations (especially parallel restore).  We can skip this
>> step if we're not dumping any data; there are no offsets to update
>> in that case.
>> "
>>
>> >
>> > Regards,
>> > Rianto
>> >
>> > On Wed, 17 Sept 2025 at 13:48, Ron Johnson <[email protected]
>> > <mailto:[email protected]>> wrote:
>> >
>> >
>> >     PG 17 has integrated zstd compression, while --format=directory lets
>> >     you do multi-threaded dumps.  That's much faster than a single-
>> >     threaded pg_dump into a multi-threaded compression program.
>> >
>> >     (If for _Reasons_ you require a single-file backup, then tar the
>> >     directory of compressed files using the --remove-files option.)
>> >
>> >     On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi <[email protected]
>> >     <mailto:[email protected]>> wrote:
>> >
>> >         Sorry for not including the full command - yes , its piping to a
>> >         compression command :
>> >           | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
>> >
>> >
>> >         I think we found the issue! I'll do further testing and see how
>> >         it goes !
>> >
>> >
>> >
>> >
>> >
>> >         On Wed, 17 Sept 2025 at 11:02, Ron Johnson
>> >         <[email protected] <mailto:[email protected]>>
>> wrote:
>> >
>> >             So, piping or redirecting to a file?  If so, then that's the
>> >             problem.
>> >
>> >             pg_dump directly to a file puts file offsets in the TOC.
>> >
>> >             This how I do custom dumps:
>> >             cd $BackupDir
>> >             pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump
>> >               2> ${db}.log
>> >
>> >             On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi
>> >             <[email protected] <mailto:[email protected]>> wrote:
>> >
>> >                 pg_dump was done using the following command :
>> >                 pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
>> >
>> >                 On Wed, 17 Sept 2025 at 08:36, Adrian Klaver
>> >                 <[email protected]
>> >                 <mailto:[email protected]>> wrote:
>> >
>> >                     On 9/16/25 15:25, R Wahyudi wrote:
>> >                      >
>> >                      > I'm trying to troubleshoot the slowness issue
>> >                     with pg_restore and
>> >                      > stumbled across a recent post about pg_restore
>> >                     scanning the whole file :
>> >                      >
>> >                      >  > "scanning happens in a very inefficient way,
>> >                     with many seek calls and
>> >                      > small block reads. Try strace to see them. This
>> >                     initial phase can take
>> >                      > hours in a huge dump file, before even starting
>> >                     any actual restoration."
>> >                      > see : https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
>> >
>> >                     This was for pg_dump output that was streamed to a
>> >                     Borg archive and as
>> >                     result had no object offsets in the TOC.
>> >
>> >                     How are you doing your pg_dump?
>> >
>> >
>> >
>> >                     --
>> >                     Adrian Klaver
>> >                     [email protected]
>> >                     <mailto:[email protected]>
>> >
>> >
>> >
>> >             --
>> >             Death to <Redacted>, and butter sauce.
>> >             Don't boil me, I'm still alive.
>> >             <Redacted> lobster!
>> >
>> >
>> >
>> >     --
>> >     Death to <Redacted>, and butter sauce.
>> >     Don't boil me, I'm still alive.
>> >     <Redacted> lobster!
>> >
>>
>>
>> --
>> Adrian Klaver
>> [email protected]
>>
>

-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!



* Re: pg_restore scan
  2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
  2025-09-17 00:54 ` Re: pg_restore scan R Wahyudi <[email protected]>
@ 2025-09-17 02:25   ` Adrian Klaver <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: Adrian Klaver @ 2025-09-17 02:25 UTC (permalink / raw)
  To: R Wahyudi <[email protected]>; +Cc: [email protected]

On 9/16/25 17:54, R Wahyudi wrote:
> pg_dump was done using the following command :
> pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>

What do you do with the output?




-- 
Adrian Klaver
[email protected]







end of thread, other threads:[~2025-09-19 04:06 UTC | newest]

Thread overview: 13+ messages
2025-09-16 22:36 Re: pg_restore scan Adrian Klaver <[email protected]>
2025-09-17 00:54 ` R Wahyudi <[email protected]>
2025-09-17 01:02   ` Ron Johnson <[email protected]>
2025-09-17 02:50     ` R Wahyudi <[email protected]>
2025-09-17 03:47       ` Ron Johnson <[email protected]>
2025-09-18 12:58         ` R Wahyudi <[email protected]>
2025-09-18 14:09           ` Ron Johnson <[email protected]>
2025-09-18 15:54           ` Adrian Klaver <[email protected]>
2025-09-18 21:36             ` R Wahyudi <[email protected]>
2025-09-18 21:45               ` Adrian Klaver <[email protected]>
2025-09-18 23:45                 ` R Wahyudi <[email protected]>
2025-09-19 04:06               ` Ron Johnson <[email protected]>
2025-09-17 02:25   ` Adrian Klaver <[email protected]>
