public inbox for [email protected]help / color / mirror / Atom feed
Re: PG17 optimizations to vacuum 7+ messages / 4 participants [nested] [flat]
* Re: PG17 optimizations to vacuum @ 2024-09-02 17:47 Peter Geoghegan <[email protected]> 0 siblings, 1 reply; 7+ messages in thread From: Peter Geoghegan @ 2024-09-02 17:47 UTC (permalink / raw) To: Melanie Plageman <[email protected]>; +Cc: Pavel Luzanov <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>; Heikki Linnakangas <[email protected]> On Mon, Sep 2, 2024 at 1:29 PM Melanie Plageman <[email protected]> wrote: > I'll investigate more tomorrow, but based on my initial investigation, > there appears to be some interaction related to how much of the > relation is in shared buffers after creating the table and updating > it. If you set shared_buffers sufficiently high and prewarm the table > after the update, master has fewer WAL records reported by vacuum > verbose. Fewer of what specific kind of WAL record? All of the details about useful work done by VACUUM were identical across versions. It was only the details related to WAL, buffers, and CPU time that changed. Perhaps I'm not thinking of something obvious. Maybe it's extra VISIBILITY records? But I'd expect the number of VISIBILITY records to match the number of pages frozen, given these particulars. VACUUM VERBOSE at least shows that that hasn't changed. -- Peter Geoghegan ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: PG17 optimizations to vacuum @ 2024-09-02 19:23 Melanie Plageman <[email protected]> parent: Peter Geoghegan <[email protected]> 0 siblings, 2 replies; 7+ messages in thread From: Melanie Plageman @ 2024-09-02 19:23 UTC (permalink / raw) To: Peter Geoghegan <[email protected]>; +Cc: Pavel Luzanov <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>; Heikki Linnakangas <[email protected]> On Mon, Sep 2, 2024 at 1:47 PM Peter Geoghegan <[email protected]> wrote: > > On Mon, Sep 2, 2024 at 1:29 PM Melanie Plageman > <[email protected]> wrote: > > I'll investigate more tomorrow, but based on my initial investigation, > > there appears to be some interaction related to how much of the > > relation is in shared buffers after creating the table and updating > > it. If you set shared_buffers sufficiently high and prewarm the table > > after the update, master has fewer WAL records reported by vacuum > > verbose. > > Fewer of what specific kind of WAL record? I would have expected to see no freeze records (since they no longer exist) and the same number of prune records. However, the overall number of records that I get for 16 and master is pretty similar. For some reason I stopped being able to reproduce Pavel's case. I'll work more on it tomorrow. This is roughly what I get for records by vacuum. Note that I prefixed VACUUM with BTREE on master to indicate those records are from index vacuuming. By default the headesc routine for records emitted by index vacuuming prints just VACUUM -- perhaps it would be better to prefix it. Note that these add up to almost the same thing. I don't know yet why the number PRUNE_VACUUM_SCAN is different than PRUNE on 16. PRUNE_VACUUM_SCAN and PRUNE + FREEZE_PAGE on 16 are similar. So, there must be pages that don't have items being pruned which are being frozen. I'll need to investigate further. master -- PRUNE_ON_ACCESS | 6 PRUNE_VACUUM_SCAN | 30974 PRUNE_VACUUM_CLEANUP | 14162 BTREE_VACUUM | 19127 16 -- PRUNE | 15504 FREEZE_PAGE | 13257 VACUUM | 34527 - Melanie ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: PG17 optimizations to vacuum @ 2024-09-02 19:42 Peter Geoghegan <[email protected]> parent: Melanie Plageman <[email protected]> 1 sibling, 0 replies; 7+ messages in thread From: Peter Geoghegan @ 2024-09-02 19:42 UTC (permalink / raw) To: Melanie Plageman <[email protected]>; +Cc: Pavel Luzanov <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>; Heikki Linnakangas <[email protected]> On Mon, Sep 2, 2024 at 3:23 PM Melanie Plageman <[email protected]> wrote: > This is roughly what I get for records by vacuum. Note that I prefixed > VACUUM with BTREE on master to indicate those records are from index > vacuuming. By default the headesc routine for records emitted by index > vacuuming prints just VACUUM -- perhaps it would be better to prefix > it. > > Note that these add up to almost the same thing. I don't know yet why > the number PRUNE_VACUUM_SCAN is different than PRUNE on 16. That is indeed surprising, given that Pavel's VACUUM VERBOSE output indicates that the number of heap tuples deleted is identical across versions. The output also strongly suggests that the same heap pages are pruned on both versions, since the "index scan needed: " line is also identical across versions. Might it be that the extra PRUNE_VACUUM_SCAN records originated in pages that only contained existing LP_UNUSED items when scanned by VACUUM? -- Peter Geoghegan ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: PG17 optimizations to vacuum @ 2024-09-02 20:35 Pavel Luzanov <[email protected]> parent: Melanie Plageman <[email protected]> 1 sibling, 2 replies; 7+ messages in thread From: Pavel Luzanov @ 2024-09-02 20:35 UTC (permalink / raw) To: Melanie Plageman <[email protected]>; Peter Geoghegan <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>; Heikki Linnakangas <[email protected]> On 02.09.2024 22:23, Melanie Plageman wrote: > For some reason I stopped being able to reproduce Pavel's case. I repeated the test on another computer, but compared master with v15. The results are the same. The test can be simplified as follows: CREATE TABLE t(id integer) WITH (autovacuum_enabled = off); INSERT INTO t SELECT gen.id FROM generate_series(1,3_500_000) gen(id); CREATE INDEX t_id ON t(id); VACUUM FREEZE t; UPDATE t SET id = id + 1; VACUUM FREEZE VERBOSE t; My results (only line with WAL info from the last VACUUM command). master: WAL usage: 119583 records, 37231 full page images, 272631468 bytes v15: WAL usage: 96565 records, 47647 full page images, 217144602 bytes If it helps, without creating index on id column, the numbers will be much closer: master: WAL usage: 78502 records, 22090 full page images, 196215494 bytes v15: WAL usage: 77437 records, 30872 full page images, 152080268 bytes -- Pavel Luzanov Postgres Professional:https://postgrespro.com ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: PG17 optimizations to vacuum @ 2024-09-02 20:58 Heikki Linnakangas <[email protected]> parent: Pavel Luzanov <[email protected]> 1 sibling, 0 replies; 7+ messages in thread From: Heikki Linnakangas @ 2024-09-02 20:58 UTC (permalink / raw) To: Pavel Luzanov <[email protected]>; Melanie Plageman <[email protected]>; Peter Geoghegan <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]> On 02/09/2024 23:35, Pavel Luzanov wrote: > On 02.09.2024 22:23, Melanie Plageman wrote: >> For some reason I stopped being able to reproduce Pavel's case. I also cannot reproduce this. > I repeated the test on another computer, but compared master with v15. > The results are the same. The test can be simplified as follows: > > CREATE TABLE t(id integer) WITH (autovacuum_enabled = off); > INSERT INTO t SELECT gen.id FROM generate_series(1,3_500_000) gen(id); > CREATE INDEX t_id ON t(id); > VACUUM FREEZE t; > UPDATE t SET id = id + 1; > VACUUM FREEZE VERBOSE t; > > My results (only line with WAL info from the last VACUUM command). > > master: > WAL usage: 119583 records, 37231 full page images, 272631468 bytes > > v15: > WAL usage: 96565 records, 47647 full page images, 217144602 bytes Can you dump the stats with pg_waldump please. Something like: CREATE TABLE t(id integer) WITH (autovacuum_enabled = off); INSERT INTO t SELECT gen.id FROM generate_series(1,3_500_000) gen(id); CREATE INDEX t_id ON t(id); SET maintenance_work_mem = '1MB'; UPDATE t SET id = id + 1; select pg_current_wal_insert_lsn(); -- <<< PRINT WAL POS BEFORE VACUUM VACUUM FREEZE VERBOSE t; And then: pg_waldump -p data/pg_wal/ -s 1/F4474498 --stats=record where "1/F4474498" is the position printed by the "SELECT pg_current_wal_insert_lsn()" above. Do you have any non-default settings? "select name, current_setting(name), source from pg_settings where setting <> boot_val;" would show that. -- Heikki Linnakangas Neon (https://neon.tech) ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: PG17 optimizations to vacuum @ 2024-09-02 20:58 Peter Geoghegan <[email protected]> parent: Pavel Luzanov <[email protected]> 1 sibling, 1 reply; 7+ messages in thread From: Peter Geoghegan @ 2024-09-02 20:58 UTC (permalink / raw) To: Pavel Luzanov <[email protected]>; +Cc: Melanie Plageman <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>; Heikki Linnakangas <[email protected]> On Mon, Sep 2, 2024 at 4:35 PM Pavel Luzanov <[email protected]> wrote: > If it helps, without creating index on id column, the numbers will be > much closer: Yes, avoiding all index vacuuming seems useful. It makes the test case cleaner, since we don't have to think about the variability from the TIDStore work (and from index vacuuming more generally). > master: > WAL usage: 78502 records, 22090 full page images, 196215494 bytes > > v15: > WAL usage: 77437 records, 30872 full page images, 152080268 bytes To be clear, I wouldn't be all that surprised if some issue with alignment padding and/or the new WAL format made master write more bytes of WAL total than on earlier versions. Small inefficiencies like that were always a possibility. That's not the problem that I'm principally concerned about right now. I'm most concerned about the simple fact that we're writing more individual WAL records than on previous versions, despite the fact that VACUUM does exactly the same amount of useful pruning and freezing work across versions. How could a patch that merged together the previous PRUNE and VACUUM WAL record types into a single record type actually cause an increase in the number of WAL records written? I'd have thought that that was simply impossible. -- Peter Geoghegan ^ permalink raw reply [nested|flat] 7+ messages in thread
* Re: PG17 optimizations to vacuum @ 2024-09-02 21:17 Peter Geoghegan <[email protected]> parent: Peter Geoghegan <[email protected]> 0 siblings, 0 replies; 7+ messages in thread From: Peter Geoghegan @ 2024-09-02 21:17 UTC (permalink / raw) To: Pavel Luzanov <[email protected]>; +Cc: Melanie Plageman <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>; Heikki Linnakangas <[email protected]> On Mon, Sep 2, 2024 at 4:58 PM Peter Geoghegan <[email protected]> wrote: > On Mon, Sep 2, 2024 at 4:35 PM Pavel Luzanov <[email protected]> wrote: > > If it helps, without creating index on id column, the numbers will be > > much closer: > > Yes, avoiding all index vacuuming seems useful. It makes the test case > cleaner, since we don't have to think about the variability from the > TIDStore work (and from index vacuuming more generally). It just occurred to me that earlier versions don't have the HEAP_PAGE_PRUNE_MARK_UNUSED_NOW optimization added by commit c120550edb. Postgres 17 does have that optimization, though, so it should easily be able to write far fewer WAL records than earlier versions. And yet your revised no-indexes test case seems to show that Postgres 17 is doing slightly worse by that measure (and by others). -- Peter Geoghegan ^ permalink raw reply [nested|flat] 7+ messages in thread
end of thread, other threads:[~2024-09-02 21:17 UTC | newest] Thread overview: 7+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2024-09-02 17:47 Re: PG17 optimizations to vacuum Peter Geoghegan <[email protected]> 2024-09-02 19:23 ` Melanie Plageman <[email protected]> 2024-09-02 19:42 ` Peter Geoghegan <[email protected]> 2024-09-02 20:35 ` Pavel Luzanov <[email protected]> 2024-09-02 20:58 ` Heikki Linnakangas <[email protected]> 2024-09-02 20:58 ` Peter Geoghegan <[email protected]> 2024-09-02 21:17 ` Peter Geoghegan <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox