From: Chao Li <[email protected]>
To: Andres Freund <[email protected]>
To: David Rowley <[email protected]>
Cc: PostgreSQL Developers <[email protected]>
Subject: Re: More speedups for tuple deformation
Date: Fri, 23 Jan 2026 13:29:22 +0800
Message-ID: <[email protected]>
In-Reply-To: <pmik622adey6fnddivkt4uvkulvnc6rasmq3tcbrzeglx4hsn7@f3x6e2eph3w5>
References: <CAApHDvpoFjaj3+w_jD5uPnGazaw41A71tVJokLDJg2zfcigpMQ@mail.gmail.com>
	<CAApHDvrF6DG7=xD8JGo2HoQKN0LRFNF0ysVt6cKSNPiqbdQOSA@mail.gmail.com>
	<CAApHDvoh3Q413szd-zsUTCpQPWNdpUYvx-fvsB8DP8zOja+ckg@mail.gmail.com>
	<[email protected]>
	<CAApHDvqhbJU_-yF3Hbf4VhX33qXtpeYv3MsvMPDMcDwGGLr9ZQ@mail.gmail.com>
	<rbskhk7scqbxqnaw4o6nh6na2ffcclg3cxn4d4cn5jfr2z7vv3@kadtz65meesb>
	<CAApHDvpDxDFatUskuOfuM7A3VESrx8U7MtYnU_HiB0QLAg94zg@mail.gmail.com>
	<pmik622adey6fnddivkt4uvkulvnc6rasmq3tcbrzeglx4hsn7@f3x6e2eph3w5>



> On Jan 23, 2026, at 09:18, Andres Freund <[email protected]> wrote:
> 
> Hi,
> 
> I haven't yet looked at the new version of the patch, but I ran your benchmark
> from upthread (fwiw, I removed the sleep 10 to reduce runtimes, the results
> seem stable enough anyway) on two Intel machines, as you mentioned that you
> saw a lot of variation in Azure.
> 
> For both I disabled turbo boost, cpu idling and pinned the backend to a single
> CPU core.
> 
> There's a bit of noise on "awork3" (basically an editor and an idle browser
> window), but everything is pinned to the other socket. "awork4" is entirely
> idle.
> 
> 
> Looks like overall the results are quite impressive!  Some of the extra_cols=0
> runs on Sapphire Rapids are a bit slower, but the losses are much smaller than
> the gains in other cases.
> 
> 
> I think it'd be good to add a few test cases of "incremental deforming" to the
> benchmark. E.g. a qual that accesses column 10, but projection then deforms up
> to 20.  I'm a bit worried that e.g. the repeated first_null_attr()
> computations could cause regressions.
> 
> 
> Greetings,
> 
> Andres Freund
> <deform_bench.csv>

Today I ran the benchmark on my MacBook M4 against three builds (all compiled with -O2 and without assertions):

1) Master (f9a468c664a)
2) Master + v4
3) Master + v4 + my tweak (first_null_attr returns 0 immediately when natts == 0)
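
To make the tweak in 3) concrete, here is a minimal sketch of the early return. It is not the code from the patch: the name first_null_attr_sketch, its signature, and the bit-by-bit scan are assumptions for illustration only. It relies on the heap tuple null-bitmap convention that a set bit means the attribute is not null, and it returns natts when no attribute is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch, NOT the patch's first_null_attr(): return the
 * zero-based index of the first NULL attribute according to a null bitmap
 * in which a set bit means "attribute is not null". Returns natts when
 * none of the first natts attributes is NULL.
 */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
    assert(natts >= 0);

    /* The tweak: with zero attributes there is nothing to scan. */
    if (natts == 0)
        return 0;

    assert(bits != NULL);

    for (int attnum = 0; attnum < natts; attnum++)
    {
        /* A clear bit marks a NULL attribute. */
        if ((bits[attnum >> 3] & (1 << (attnum & 0x07))) == 0)
            return attnum;
    }

    return natts;
}
```

In this sketch the early return gives the same result the loop would give for natts == 0, so it only skips setup work. Whether that matters in the real function depends on how v4 implements the scan.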

Overall, v4 shows significant improvements across most configuration combinations. In the best case, v4 is about 43% faster than master.

The tweaked version is only slightly faster than v4: in the best case, it gains an additional ~3.5% over v4.

Note that the MacBook is my working laptop. I didn't actively use it while the tests were running, but it was not fully idle, as some other applications (email, VS Code, etc.) were running in the background. That said, I suppose the comparison is still fair, since all three test runs were made under the same conditions.

See the attached Excel sheet for details.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






Attachments:

  [application/vnd.openxmlformats-officedocument.spreadsheetml.sheet] pgbench_comparison_chao_li_mac_m4.xlsx (15.8K, 2-pgbench_comparison_chao_li_mac_m4.xlsx)