Hello hackers,
I think we might be able to go one step further and completely skip the attislocal UPDATE for partition tables. This is because the attislocal UPDATE is done immediately after 'CREATE TABLE', during the 'ATTACH PARTITION' step(see attislocal being set to false in MergeAttributesIntoExisting). The UPDATEs emitted by pg_dump are therefore redundant. Even with batching, the single UPDATE still modifies N(no of columns) rows causing N relcache invalidations. This same workflow is then repeated by ATTACH PARTITION causing another N relcache invalidations.
Skipping the attislocal UPDATE definitely speeds up the runtime if there are a lot of partition tables because we will avoid quite a lot of relcache invalidations and rebuild calls. Since this optimization removes the attislocal UPDATE completely, the effect will be even more pronounced for wider partition tables.
There's already precedent for this:
* attinhcount is never explicitly set by pg_dump. It is only modified by MergeAttributesIntoExisting during ATTACH PARTITION
* conislocal for CHECK constraints is explicitly not fixed for partitions. See comment "No need to fix conislocal: ATTACH PARTITION does that" in dumpTableSchema
The only risk I can foresee is the window between CREATE TABLE and ATTACH PARTITION where attislocal will be incorrectly set to true. But I think this window is small enough to not worry about since ATTACH PARTITION immediately succeeds CREATE TABLE(maybe barring some other minor updates)
Here is a simple patch
```
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 137161aa5e0..d3d7403228a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17734,7 +17734,8 @@ dumpTableSchema(Archive *fout, const TableInfo *tbinfo)
for (j = 0; j < tbinfo->numatts; j++)
{
if (!tbinfo->attisdropped[j] &&
- !tbinfo->attislocal[j])
+ !tbinfo->attislocal[j] &&
+ !tbinfo->ispartition)
{
if (firstitem)
{
```
I tried a few experiments on my local macbook and noticed an improvement of around ~33-36% (% can vary mostly depending on the number of columns)
Setup(master branch): 300 partitioned root tables with 200 leaves each = 60000 partition tables
300 columns per partition:
baseline : ~30 mins
with patch : ~20 mins (~33% faster)
700 columns per partition:
baseline : ~66 mins
with patch : ~42 mins (~36% faster)
Thanks
Nikhil
Broadcom Inc.