From: Tom Lane
To: Andrew Dunstan
cc: David Rowley, PostgreSQL Hackers
Subject: Re: scale parallel_tuple_cost by tuple width
In-reply-to: <2d7c4e54-6a0b-4b1a-87a6-278c49fa52f0@dunslane.net>
References: <2005009.1774880253@sss.pgh.pa.us> <2d7c4e54-6a0b-4b1a-87a6-278c49fa52f0@dunslane.net>
Comments: In-reply-to Andrew Dunstan message dated "Wed, 01 Apr 2026 07:15:40 -0400"
Date: Wed, 01 Apr 2026 14:02:58 -0400
Message-ID: <2992735.1775066578@sss.pgh.pa.us>

Andrew Dunstan writes:
> I followed your suggested methodology to measure how Gather IPC
> cost actually scales with tuple width.

I ran your test script on two of my development machines and got:

Linux/x86_64:

Width  Parallel(ms)  Serial(ms)  Speedup  Gather rows
-----  ------------  ----------  -------  -----------
    8       510.976    1219.453    2.39x       235706
   16       532.123    1271.692    2.39x       235603
   32       588.826    1356.428    2.30x       242062
   64       674.485    1570.758    2.33x       239561
  128       817.417    1887.202    2.31x       243158
  256      1066.836    2548.100    2.39x       242304
  384      1293.900    3038.905    2.35x       243941
  512      1515.822    3573.144    2.36x       239064
  768      1998.638    4725.448    2.36x       247558
 1024      9865.460   22779.795    2.31x     10000000

macOS/M4-Pro:

Width  Parallel(ms)  Serial(ms)  Speedup  Gather rows
-----  ------------  ----------  -------  -----------
    8       299.464     769.130    2.57x       242549
   16       310.361     787.629    2.54x       243643
   32       344.541     839.589    2.44x       242419
   64       413.330     967.512    2.34x       238771
  128       519.794    1185.757    2.28x       241440
  256      1479.766    1823.559    1.23x       238615
  384      2022.882    2326.823    1.15x       240617
  512      2423.938    2778.995    1.15x       244752
  768      3511.425    3934.384    1.12x       235814
 1024      9905.073   12214.577    1.23x     10000000

It's not entirely clear to me how you reduced these numbers to
a ptc formula, but we should do that and see how the results
compare to your machine.

> The original patch used PARALLEL_TUPLE_COST_FIXED_FRAC = 0.10,
> which substantially underestimates the width-independent component.
> A higher fixed fraction would dampen the width adjustment, which
> also partly addresses Tom's concern about sensitivity to width
> estimate errors: with ~45% of the cost being fixed, even a 2x
> error in width only translates to a ~1.5x error in total cost.

That does make me feel better, assuming that we come out with
similar results on several machines.

> The script used to get the timings is attached.

If anyone else wants to try this with a platform having non-GNU grep,
you'll need these changes:

$ diff -pud ptc_calibrate.sh~ ptc_calibrate.sh
--- ptc_calibrate.sh~	2026-04-01 10:02:53.000000000 -0400
+++ ptc_calibrate.sh	2026-04-01 13:31:06.873772739 -0400
@@ -32,7 +32,8 @@ psql_cmd() {
 
 psql_cmd_timing() {
     "$PGBIN/psql" -h /tmp -p $PORT -d "$DB" -o /dev/null -qAt \
-        -c '\timing on' "$@" 2>&1 | grep -oP 'Time: \K[\d.]+' | tail -1
+        -c '\timing on' "$@" 2>&1 | \
+        sed -n 's/.*Time: \([0-9.][0-9.]*\).*/\1/p' | tail -1
 }
 
 # Create the database if needed
@@ -114,7 +115,7 @@ for W in $WIDTHS; do
 
     # Get Gather row count from EXPLAIN
     gather_rows=$(psql_cmd -c "$SET_CMDS EXPLAIN (COSTS ON) $Q;" \
-        | grep -oP 'Gather.*rows=\K\d+' | head -1)
+        | sed -n 's/.*Gather.*rows=\([0-9][0-9]*\).*/\1/p' | head -1)
     gather_rows=${gather_rows:-"?"}
 
     # Warm up (2 runs each)

			regards, tom lane
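
P.S. A quick way to check the sed replacements on a given platform
before running the whole calibration is sketched below. It wraps the
two sed expressions from the diff in helper functions and feeds them
one sample line each. The function names and the sample lines are
made up for illustration (the first imitates a psql \timing line, the
second a Gather line from EXPLAIN); only the sed expressions
themselves come from the diff.

```shell
#!/bin/sh
# Portable stand-ins for the two "grep -oP" calls. The function names
# and sample inputs are illustrative, not part of ptc_calibrate.sh.

# Print the number that follows "Time: " on a psql \timing line.
extract_time() {
    sed -n 's/.*Time: \([0-9.][0-9.]*\).*/\1/p'
}

# Print the rows= estimate from a Gather line of EXPLAIN output.
extract_gather_rows() {
    sed -n 's/.*Gather.*rows=\([0-9][0-9]*\).*/\1/p'
}

# Sample lines; each command should print just the extracted number.
echo 'Time: 510.976 ms' | extract_time
echo ' Gather  (cost=1000.00..2345.67 rows=235706 width=8)' | extract_gather_rows
```

If either command prints nothing, the local sed is not handling the
expression the way the script expects, and the corresponding column
in the results would come out empty or as "?".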