Naive question about multithreading/multicore

public inbox for [email protected]  

From: Marc SCHAEFER @ 2024-10-12 17:30 UTC
  To: [email protected]

Hello,

On a machine where I start two processes of:
   perl -e 'while (1) { ; }'
I see two processes at 100% CPU in top, which is expected.

Now, if I do:

template1=> SELECT COUNT(*) FROM pg_class a, pg_class b, pg_class c;

I see only one 100% CPU PostgreSQL process.

I read that while each PostgreSQL connection maps to a UNIX process,
which is better for isolation, some operations have been parallelized
and can use more than one core/thread.

Maybe this specific case is not (yet?) parallelized, or should it
be, and thus something is missing in my configuration?

Thank you.

PS: psql (13.16 (Debian 13.16-0+deb11u1))

From: Thomas Munro @ 2024-10-12 19:16 UTC
  To: Marc SCHAEFER <[email protected]>; +Cc: [email protected]

On Sun, Oct 13, 2024 at 6:31 AM Marc SCHAEFER
<[email protected]> wrote:
> template1=> SELECT COUNT(*) FROM pg_class a, pg_class b, pg_class c;
>
> I see only one 100% CPU PostgreSQL process.

If you set min_parallel_table_scan_size = 0 then it uses
parallelism, and completes much faster.  The planner generally works
by comparing the estimated cost of various plans (it is a "cost based"
optimiser), but the decision to actually consider parallelism at all
is essentially "rule based", and the rules aren't smart enough for
this query with default settings.  pg_class is considered too small to
bother parallelising the scan, but here you have a 3-way cross-join
which generates an enormous amount of work for each tuple, so it is
actually a good idea to parallelise it.  I guess people don't actually
do that too often.
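
A minimal psql sketch of the suggestion above, assuming a session on
the same server and default parallel settings otherwise.  The SHOW and
RESET statements and the use of EXPLAIN are additions for illustration;
only the SET and the query come from this thread.

```sql
-- Inspect the settings that gate parallel plans (session view).
SHOW min_parallel_table_scan_size;      -- stock default is 8MB
SHOW max_parallel_workers_per_gather;   -- stock default is 2; 0 disables parallel plans

-- Session-local change: let the planner consider parallel scans
-- even for very small tables such as pg_class.
SET min_parallel_table_scan_size = 0;

-- Plain EXPLAIN only plans the query, it does not run it.
-- A parallel plan contains a Gather node with a "Workers Planned" line.
EXPLAIN
SELECT COUNT(*) FROM pg_class a, pg_class b, pg_class c;

-- Restore the configured value for the rest of the session.
RESET min_parallel_table_scan_size;
```

Because SET only affects the current session, this is safe to try
without touching postgresql.conf.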

From: Marc SCHAEFER @ 2024-10-14 07:30 UTC
  To: [email protected]

Hello,

On Sun, Oct 13, 2024 at 08:16:04AM +1300, Thomas Munro wrote:
> > template1=> SELECT COUNT(*) FROM pg_class a, pg_class b, pg_class c;
> >
> > I see only one 100% CPU PostgreSQL process.
> 
> If you set min_parallel_table_scan_size = 0 then it uses

Without it, it uses one CPU and takes about 8.5 s (count is 57512456).

With it, it is indeed parallel (multiple CPUs used) and takes about 6 s.

As this is on a machine with slow disks, it is perfectly ok, I just
wanted to see the CPU parallelism in action.

Thank you!
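
As a rough sanity check on the size of the cross-join: the reported
count of 57512456 is exactly 386 cubed, which suggests pg_class held
386 rows on that machine.  The figure 386 is inferred from the
arithmetic and is not stated anywhere in the thread.

```sql
-- Row count of the catalog on the local installation
-- (will differ from machine to machine).
SELECT COUNT(*) FROM pg_class;

-- Cube of the inferred row count; equals the count quoted above.
SELECT 386 * 386 * 386 AS cross_join_rows;   -- 57512456
```

So a table of a few hundred rows, too small for the parallel-scan rule
to fire, still produces tens of millions of join rows to count.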
