public inbox for [email protected]help / color / mirror / Atom feed
Loading the latest N rows into the cache seems way too fast. 6+ messages / 3 participants [nested] [flat]
* Loading the latest N rows into the cache seems way too fast. @ 2025-02-17 20:32 Ron Johnson <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Ron Johnson @ 2025-02-17 20:32 UTC (permalink / raw) To: pgsql-general PG 9.6.24 and PG 14.15, if it matters. (Yes, 9.6 is really EOL. I don't control that.) (I could use pg_prewarm, but the table is much bigger than RAM, and last_block value only has the newest record if data has never been deleted. The oldest records regularly get deleted, and then the table is vacuumed; thus, new records can be anywhere in the table.) Thus, roll my own cache-loading statement. The bigint "id" column in "mytbl" is populated from a sequence, and so is monotonically increasing: the newest records will have the biggest id values. The table also has a bytea column that averages about 100KB. Loading 200K rows is more than 200MB. I expected this "prewarm" statement to take much longer than 1/2 second. Am I still in the dark ages of computer speed, or is this statement not doing what I hope it's doing? $ time psql -h foo bar -Xc "DO \$\$ BEGIN PERFORM * FROM mytbl ORDER BY id DESC LIMIT 200000 ; END \$\$;" DO real 0m0.457s user 0m0.005s sys 0m0.004s -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster! ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Loading the latest N rows into the cache seems way too fast. @ 2025-02-17 21:36 Tom Lane <[email protected]> parent: Ron Johnson <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Tom Lane @ 2025-02-17 21:36 UTC (permalink / raw) To: Ron Johnson <[email protected]>; +Cc: pgsql-general Ron Johnson <[email protected]> writes: > The bigint "id" column in "mytbl" is populated from a sequence, and so is > monotonically increasing: the newest records will have the biggest id > values. > The table also has a bytea column that averages about 100KB. > Loading 200K rows is more than 200MB. I expected this "prewarm" statement > to take much longer than 1/2 second. Am I still in the dark ages of > computer speed, or is this statement not doing what I hope it's doing? It's not pulling in the TOAST storage where the bytea column lives. (pg_prewarm wouldn't have either, without special pushups.) regards, tom lane ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Loading the latest N rows into the cache seems way too fast. @ 2025-02-17 21:40 Ron Johnson <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 2 replies; 6+ messages in thread From: Ron Johnson @ 2025-02-17 21:40 UTC (permalink / raw) To: pgsql-general On Mon, Feb 17, 2025 at 4:36 PM Tom Lane <[email protected]> wrote: > Ron Johnson <[email protected]> writes: > > The bigint "id" column in "mytbl" is populated from a sequence, and so is > > monotonically increasing: the newest records will have the biggest id > > values. > > The table also has a bytea column that averages about 100KB. > > > Loading 200K rows is more than 200MB. I expected this "prewarm" > statement > > to take much longer than 1/2 second. Am I still in the dark ages of > > computer speed, or is this statement not doing what I hope it's doing? > > It's not pulling in the TOAST storage where the bytea column lives. > (pg_prewarm wouldn't have either, without special pushups.) > Puzzling, since I ran "PERFORM *". What if I explicitly mentioned the bytea column's name? -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster! ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Loading the latest N rows into the cache seems way too fast. @ 2025-02-17 21:51 Tom Lane <[email protected]> parent: Ron Johnson <[email protected]> 1 sibling, 1 reply; 6+ messages in thread From: Tom Lane @ 2025-02-17 21:51 UTC (permalink / raw) To: Ron Johnson <[email protected]>; +Cc: pgsql-general Ron Johnson <[email protected]> writes: > On Mon, Feb 17, 2025 at 4:36 PM Tom Lane <[email protected]> wrote: >> It's not pulling in the TOAST storage where the bytea column lives. >> (pg_prewarm wouldn't have either, without special pushups.) > Puzzling, since I ran "PERFORM *". What if I explicitly mentioned the > bytea column's name? You'd have to do something that actually used the column's value, perhaps "md5(byteacol)" or such. (The obvious candidate would be length(), but I think that is optimized to not fetch or decompress the whole value.) regards, tom lane ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Loading the latest N rows into the cache seems way too fast. @ 2025-02-17 21:58 David G. Johnston <[email protected]> parent: Ron Johnson <[email protected]> 1 sibling, 0 replies; 6+ messages in thread From: David G. Johnston @ 2025-02-17 21:58 UTC (permalink / raw) To: Ron Johnson <[email protected]>; +Cc: pgsql-general On Mon, Feb 17, 2025 at 2:41 PM Ron Johnson <[email protected]> wrote: > On Mon, Feb 17, 2025 at 4:36 PM Tom Lane <[email protected]> wrote: > >> Ron Johnson <[email protected]> writes: >> > The bigint "id" column in "mytbl" is populated from a sequence, and so >> is >> > monotonically increasing: the newest records will have the biggest id >> > values. >> > The table also has a bytea column that averages about 100KB. >> >> > Loading 200K rows is more than 200MB. I expected this "prewarm" >> statement >> > to take much longer than 1/2 second. Am I still in the dark ages of >> > computer speed, or is this statement not doing what I hope it's doing? >> >> It's not pulling in the TOAST storage where the bytea column lives. >> (pg_prewarm wouldn't have either, without special pushups.) >> > > Puzzling, since I ran "PERFORM *". What if I explicitly mentioned the > bytea column's name? > > It's more about the system optimizing away data retrieval because you've indicated you don't care about the contents due to using PERFORM. All it needs is a pointer to represent the future data, not the data itself. And PERFORM will never resolve that pointer by itself - so as Tom said your query would need to force pointer resolution by computing on the data. David J. ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Loading the latest N rows into the cache seems way too fast. @ 2025-02-17 23:03 Ron Johnson <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 0 replies; 6+ messages in thread From: Ron Johnson @ 2025-02-17 23:03 UTC (permalink / raw) To: pgsql-general On Mon, Feb 17, 2025 at 4:51 PM Tom Lane <[email protected]> wrote: > Ron Johnson <[email protected]> writes: > > On Mon, Feb 17, 2025 at 4:36 PM Tom Lane <[email protected]> wrote: > >> It's not pulling in the TOAST storage where the bytea column lives. > >> (pg_prewarm wouldn't have either, without special pushups.) > > > Puzzling, since I ran "PERFORM *". What if I explicitly mentioned the > > bytea column's name? > > You'd have to do something that actually used the column's value, > perhaps "md5(byteacol)" or such. (The obvious candidate would be > length(), but I think that is optimized to not fetch or decompress > the whole value.) > That's definitely taking a LOT longer... -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster! ^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2025-02-17 23:03 UTC | newest] Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2025-02-17 20:32 Loading the latest N rows into the cache seems way too fast. Ron Johnson <[email protected]> 2025-02-17 21:36 ` Tom Lane <[email protected]> 2025-02-17 21:40 ` Ron Johnson <[email protected]> 2025-02-17 21:51 ` Tom Lane <[email protected]> 2025-02-17 23:03 ` Ron Johnson <[email protected]> 2025-02-17 21:58 ` David G. Johnston <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox