public inbox for [email protected]  
help / color / mirror / Atom feed
From: Dimitrios Apostolou <[email protected]>
To: [email protected]
Subject: Re: Performance issues during pg_restore -j with big partitioned table
Date: Fri, 4 Apr 2025 15:13:09 +0200 (CEST)
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>

On Wed, 2 Apr 2025, Dimitrios Apostolou wrote:

> Hello list.
>
> My database includes one table with 1000 partitions, all of them rather

I was not clear here: my database dump has all that, and the database is
brand new and empty.

> sizeable. I run:
>
>  pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error
> --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump
>
> Right now after 24h of restore, I notice weird behaviour, so I have several
> questions about it:
>
> + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting".
>   I see that they are waiting to issue a TRUNCATE for one of the
>   partitions and then COPY data to it.  Checking the log I see that
>   several partitions have already been copied finished, but many more
>   are left to start.
>
>   Why is a TRUNCATE needed at the start of a partition's COPY phase? I
>   didn't issue a --clean on the command line (I don't need it as my
>   database is newly created), and I don't see a mention of related
>   TRUNCATE in the pg_restore manual.
>
> + 1 postgres backend process is doing:
>
>   ALTER TABLE the_master_partitioned_table
>     ADD CONSTRAINT ...
>       FOREIGN KEY (columnX) REFERENCES another_table(columnX)
>
>   According to my logs this started right after COPY DATA for
>   another_table was finished. And apparently it has a lock on
>   the_master_partitioned_table that all other TRUNCATE have to wait for.
>
>   Is this a bug in the dependency resolution? Wouldn't it make sense for
>   this to wait until all 1000 partitions have finished their COPY DATA
>   phase?

Trying again, pg_restore exited with error after almost 24h:

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 8904; 2606 16529 CONSTRAINT test_runs_raw test_runs_raw_partitioned_pkey
pg_restore: error: could not execute query: ERROR:  deadlock detected
DETAIL:  Process 465409 waits for AccessExclusiveLock on relation 44437 of database 44090; blocked by process 465408.
Process 465408 waits for AccessShareLock on relation 44383 of database 44090; blocked by process 465409.
HINT:  See server log for query details.

From the logs I see that:

+ Process 465409 waits for AccessExclusiveLock on relation 44437 of database 44090; blocked by process 465408.
   --> 44437 is test_runs_raw__part_max10120k (a single partition)
+ Process 465408 waits for AccessShareLock     on relation 44383 of database 44090; blocked by process 465409.
   --> 44383 is test_runs_raw  (the master partitioned table)

Process 465409:
   ALTER TABLE ONLY public.test_runs_raw
     ADD CONSTRAINT test_runs_raw_partitioned_pkey PRIMARY KEY (workitem_n, run_n);

Process 465408:
   COPY public.test_runs_raw__part_max10120k(...) FROM stdin;


Bug? This happened on a postgres compiled from last week's master branch.
The dump I'm trying to restore is from postgres 17.4.


Thanks
Dimitris







reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Performance issues during pg_restore -j with big partitioned table
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox