public inbox for [email protected]  
help / color / mirror / Atom feed
Re: optimizing number of workers
3+ messages / 2 participants
[nested] [flat]

* Re: optimizing number of workers
@ 2025-07-14 18:25  Greg Hennessy <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Greg Hennessy @ 2025-07-14 18:25 UTC (permalink / raw)
  To: Weck, Luis <[email protected]>; +Cc: [email protected] <[email protected]>

Setting those values to zero (not something I'd want to do in production)
changes the number of workes
from 10 to 13. At least something, but if anyone knows where discussion
about how to use
large numbers of CPU's in postgresql are being held I'd appreciate learning
about it.

Greg


On Fri, Jul 11, 2025 at 2:11 PM Weck, Luis <[email protected]> wrote:

> *From: *Greg Hennessy <[email protected]>
> *Date: *Thursday, July 10, 2025 at 4:40 PM
> *To: * [email protected] <
> [email protected]>
> *Subject: *optimizing number of workers
>
> Having just received a shiny new dual CPU machine to use as a postgresql
> server, I'm trying to do some reasonable efforts to configure it
> correctly. The hard
> ware has 128 cores, and I am running a VM with Redhat 9 and Postgresql
> 16.9.
>
> In postgresql.conf I have:
> max_worker_processes = 90               # (change requires restart)
> max_parallel_workers_per_gather = 72    # gsh 26 oct 2022
> max_parallel_maintenance_workers = 72   # gsh 12 jun 2025
> max_parallel_workers =  72              # gsh 12 jun 2025
> max_logical_replication_workers = 72    # gsh 12 jun 2025
> max_sync_workers_per_subscription = 72   # gsh 12 jun 2025
> autovacuum_max_workers = 12             # max number of autovacuum
> subprocesses
>
> When I do a simple count of a large (large being 1.8 Billion entries), I
> get
> about 10 workers used.
>
> prod_v1_0_0_rc1=# explain (analyze, buffers) select count(*) from
> gaiadr3.gaia_source;
>
>                QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=14379796.81..14379796.82 rows=1 width=8)
> (actual time=16702.806..16705.479 rows=1 loops=1)
>    Buffers: shared hit=2507481
>    ->  Gather  (cost=14379795.78..14379796.79 rows=10 width=8) (actual
> time=16702.513..16705.470 rows=11 loops=1)
>          Workers Planned: 10
>          Workers Launched: 10
>          Buffers: shared hit=2507481
>          ->  Partial Aggregate  (cost=14379785.78..14379785.79 rows=1
> width=8) (actual time=16691.820..16691.821 rows=1 loops=11)
>                Buffers: shared hit=2507481
>                ->  Parallel Index Only Scan using gaia_source_nest128 on
> gaia_source  (cost=0.58..13926632.85 rows=181261171 width=0) (actual
> time=0.025..9559.644 rows=164700888 loops=11)
>                      Heap Fetches: 0
>                      Buffers: shared hit=2507481
>  Planning:
>    Buffers: shared hit=163
>  Planning Time: 14.898 ms
>  Execution Time: 16705.592 ms
>
> Postgres has chosen to use only a small fraction of the CPU's I have on
> my machine. Given the query returns an answer in about 8 seconds, it may be
> that Postgresql has allocated the proper number of works. But if I wanted
> to try to tweak some config parameters to see if using more workers
> would give me an answer faster, I don't seem to see any obvious knobs
> to turn. Are there parameters that I can adjust to see if I can increase
> throughput? Would adjusting parallel_setup_cost or parallel_tuple_cost
> likely to be of help?
>
> I believe you can decrease min_parallel_table_scan_size (default is 8MB)
> and min_parallel_index_scan_size (default 5112kB). The number of workers
> depends also on a multiple of these settings.
>
>


^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: optimizing number of workers
@ 2025-07-14 18:54  Tom Lane <[email protected]>
  parent: Greg Hennessy <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Tom Lane @ 2025-07-14 18:54 UTC (permalink / raw)
  To: Greg Hennessy <[email protected]>; +Cc: Weck, Luis <[email protected]>; [email protected] <[email protected]>

Greg Hennessy <[email protected]> writes:
>> Postgres has chosen to use only a small fraction of the CPU's I have on
>> my machine. Given the query returns an answer in about 8 seconds, it may be
>> that Postgresql has allocated the proper number of works. But if I wanted
>> to try to tweak some config parameters to see if using more workers
>> would give me an answer faster, I don't seem to see any obvious knobs
>> to turn. Are there parameters that I can adjust to see if I can increase
>> throughput? Would adjusting parallel_setup_cost or parallel_tuple_cost
>> likely to be of help?

See the bit about

             * Select the number of workers based on the log of the size of
             * the relation.  This probably needs to be a good deal more
             * sophisticated, but we need something here for now.  Note that

in compute_parallel_worker().  You can move things at the margins by
changing min_parallel_table_scan_size, but that logarithmic behavior
will constrain the number of workers pretty quickly.  You'd have to
change that code to assign a whole bunch of workers to one scan.

(No, I don't know why it's done like that.  There might be related
discussion in our archives, but finding it could be difficult.)

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: optimizing number of workers
@ 2025-07-15 17:07  Greg Hennessy <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: Greg Hennessy @ 2025-07-15 17:07 UTC (permalink / raw)
  To: ; +Cc: [email protected] <[email protected]>

if I "alter table allwise set (parallel_workers = 64);" then I can get 64
workers. I wonder if the code
to check the rel_parallel_workers do deal with the default algorithm not
allocating sufficient
parallel_workers.


On Mon, Jul 14, 2025 at 2:54 PM Tom Lane <[email protected]> wrote:

> Greg Hennessy <[email protected]> writes:
> >> Postgres has chosen to use only a small fraction of the CPU's I have on
> >> my machine. Given the query returns an answer in about 8 seconds, it
> may be
> >> that Postgresql has allocated the proper number of works. But if I
> wanted
> >> to try to tweak some config parameters to see if using more workers
> >> would give me an answer faster, I don't seem to see any obvious knobs
> >> to turn. Are there parameters that I can adjust to see if I can increase
> >> throughput? Would adjusting parallel_setup_cost or parallel_tuple_cost
> >> likely to be of help?
>
> See the bit about
>
>              * Select the number of workers based on the log of the size of
>              * the relation.  This probably needs to be a good deal more
>              * sophisticated, but we need something here for now.  Note
> that
>
> in compute_parallel_worker().  You can move things at the margins by
> changing min_parallel_table_scan_size, but that logarithmic behavior
> will constrain the number of workers pretty quickly.  You'd have to
> change that code to assign a whole bunch of workers to one scan.
>
> (No, I don't know why it's done like that.  There might be related
> discussion in our archives, but finding it could be difficult.)
>
>                         regards, tom lane
>


^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2025-07-15 17:07 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-07-14 18:25 Re: optimizing number of workers Greg Hennessy <[email protected]>
2025-07-14 18:54 ` Tom Lane <[email protected]>
2025-07-15 17:07   ` Greg Hennessy <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox