From: Heikki Linnakangas <[email protected]>
To: David Geier <[email protected]>
To: Matthias van de Meent <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Reduce build times of pg_trgm GIN indexes
Date: Tue, 7 Apr 2026 14:27:40 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAEze2WiUL9idZBbuUN+MuWqr6DcPr_-C91E9MTx=H62Xx5fHaQ@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>

On 03/03/2026 19:31, David Geier wrote:
>> Attached are the patches rebased on latest master.
>>
>> I've removed the ASCII fast-path patch 0006 as it turned out to be more
>> complicated to make work than expected.
>>
>> I kept the radix sort patch because it gives a decent speedup but I
>> would like to focus for now on getting patches 0001 - 0004 merged.
>> They're all simple and, the way I see it, uncontroversial.
>>
>> I remeasured the savings of 0001 - 0004, which comes on top of the
>> already committed patch that inlined the comparison function, which gave
>> another ~5%:
>>
>> Data set            | Patched (ms) | Master (ms)  | Speedup
>> --------------------|--------------|--------------|----------
>> movies(plot)        |   8,058      |  10,311      | 1.27x
>> lineitem(l_comment) | 223,233      | 256,986      | 1.19x
>>
>> I've also registered the change at the commit fest, see
>> https://commitfest.postgresql.org/patch/6418/.
> 
> Attached is v5 that removes an incorrect assertion from the radix sort code.
>
> v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch
> v5-0002-Optimize-generate_trgm-with-sort_template.h.patch
> v5-0003-Make-btint4cmp-branchless.patch
> v5-0004-Faster-qunique-comparator-in-generate_trgm.patch
> v5-0005-Optimize-generate_trgm-with-radix-sort.patch

Pushed 0001 as commit 6f5ad00ab7.

I squashed 0002 and 0004 into one commit, and did some more refactoring: 
I created a trigram_qsort() helper function that calls the signed or 
unsigned variant, so that that logic doesn't need to be duplicated in 
the callers. For symmetry, I also added a trigram_qunique() helper 
function which just calls qunique() with the new, faster CMPTRGM_EQ 
comparator. Pushed these as commit 9f3755ea07.

Patch 0003 gives me pause. It's a tiny patch:

> @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS)
>  	int32		a = PG_GETARG_INT32(0);
>  	int32		b = PG_GETARG_INT32(1);
>  
> -	if (a > b)
> -		PG_RETURN_INT32(A_GREATER_THAN_B);
> -	else if (a == b)
> -		PG_RETURN_INT32(0);
> -	else
> -		PG_RETURN_INT32(A_LESS_THAN_B);
> +	PG_RETURN_INT32(pg_cmp_s32(a, b));
>  }

But the comments on the pg_cmp functions say:

>  * NB: If the comparator function is inlined, some compilers may produce
>  * worse code with these helper functions than with code with the
>  * following form:
>  *
>  *     if (a < b)
>  *         return -1;
>  *     if (a > b)
>  *         return 1;
>  *     return 0;
>  *

So, uh, is that really a universal improvement? Is that comment about 
producing worse code outdated?
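
For concreteness, here is a minimal standalone sketch of the two forms in 
question. The names cmp_branching(), cmp_branchless() and 
check_forms_agree() are made up for illustration. The branchless body is 
what I believe pg_cmp_s32() boils down to, but treat that as an 
assumption and check src/include/common/int.h. It only demonstrates that 
the two forms return the same results; which one compiles to better code 
depends on the compiler and on whether the comparator gets inlined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branching form, as spelled out in the comment quoted above. */
static inline int
cmp_branching(int32_t a, int32_t b)
{
	if (a < b)
		return -1;
	if (a > b)
		return 1;
	return 0;
}

/*
 * Branchless form.  ASSUMPTION: this mirrors what pg_cmp_s32() does;
 * it is not copied from the PostgreSQL sources.  Each comparison
 * yields 0 or 1, so the difference is always -1, 0 or 1 and cannot
 * overflow the way "a - b" could.
 */
static inline int
cmp_branchless(int32_t a, int32_t b)
{
	return (a > b) - (a < b);
}

/* Sanity check: both forms agree on a few values, including the extremes. */
static void
check_forms_agree(void)
{
	static const int32_t v[] = {INT32_MIN, -1, 0, 1, INT32_MAX};
	const size_t n = sizeof(v) / sizeof(v[0]);

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n; j++)
			assert(cmp_branching(v[i], v[j]) == cmp_branchless(v[i], v[j]));
}
```

Comparing the generated assembly for both functions, standalone and 
inlined into a sort loop, would be one way to see whether the comment 
still applies to current compilers.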

- Heikki





