public inbox for [email protected]help / color / mirror / Atom feed
Re: Reduce build times of pg_trgm GIN indexes 31+ messages / 8 participants [nested] [flat]
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-01-21 20:50 Matthias van de Meent <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Matthias van de Meent @ 2026-01-21 20:50 UTC (permalink / raw) To: David Geier <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; pgsql-hackers On Wed, 21 Jan 2026 at 16:45, David Geier <[email protected]> wrote: > > How do we usually go about such backwards-compatibility breaking > changes? When it concerns a bug, we mention the change in the release notes with a warning to reindex affected indexes to be sure no known corruption remains. See e.g. the final entry in the PG18 release notes' migration section here: https://www.postgresql.org/docs/18/release-18.html#RELEASE-18-MIGRATION. > Could we have pg_upgrade reindex all GIN indexes? Would that be > acceptable? No. We'd handle this like any other collation/opclass fixes; we ask users to reindex their indexes in their own time after they've upgraded their cluster. Note that in this case it concerns an issue with just one GIN opclass, not all GIN indexes; so even if we were to address this in pg_upgrade it wouldn't be a correct choice to reindex every GIN index, as only a subset of those would be affected by this issue. Generally speaking, pg_upgrade doesn't concern itself with the validity of the data structures that are described by the catalogs that it upgrades, it only concerns itself with that it correctly transcribes the catalogs from one version to another, and that the data files of the old cluster are transfered correctly without changes. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-01-23 10:18 David Geier <[email protected]> parent: Matthias van de Meent <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-01-23 10:18 UTC (permalink / raw) To: Matthias van de Meent <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; pgsql-hackers Hi Matthias, On 21.01.2026 21:50, Matthias van de Meent wrote: > On Wed, 21 Jan 2026 at 16:45, David Geier <[email protected]> wrote: >> >> How do we usually go about such backwards-compatibility breaking >> changes? > > When it concerns a bug, we mention the change in the release notes > with a warning to reindex affected indexes to be sure no known > corruption remains. See e.g. the final entry in the PG18 release > notes' migration section here: > https://www.postgresql.org/docs/18/release-18.html#RELEASE-18-MIGRATION. > >> Could we have pg_upgrade reindex all GIN indexes? Would that be >> acceptable? > > No. We'd handle this like any other collation/opclass fixes; we ask > users to reindex their indexes in their own time after they've > upgraded their cluster. Note that in this case it concerns an issue > with just one GIN opclass, not all GIN indexes; so even if we were to > address this in pg_upgrade it wouldn't be a correct choice to reindex > every GIN index, as only a subset of those would be affected by this > issue. > > Generally speaking, pg_upgrade doesn't concern itself with the > validity of the data structures that are described by the catalogs > that it upgrades, it only concerns itself with that it correctly > transcribes the catalogs from one version to another, and that the > data files of the old cluster are transfered correctly without > changes. Thanks for the clarifications and the link to the release notes. That's very helpful. Then I know how to move on and will update the patch accordingly. -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-03-02 12:17 David Geier <[email protected]> parent: David Geier <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-03-02 12:17 UTC (permalink / raw) To: Matthias van de Meent <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; pgsql-hackers On 23.01.2026 11:18, David Geier wrote: > Hi Matthias, > > On 21.01.2026 21:50, Matthias van de Meent wrote: >> On Wed, 21 Jan 2026 at 16:45, David Geier <[email protected]> wrote: >>> >>> How do we usually go about such backwards-compatibility breaking >>> changes? >> >> When it concerns a bug, we mention the change in the release notes >> with a warning to reindex affected indexes to be sure no known >> corruption remains. See e.g. the final entry in the PG18 release >> notes' migration section here: >> https://www.postgresql.org/docs/18/release-18.html#RELEASE-18-MIGRATION. >> >>> Could we have pg_upgrade reindex all GIN indexes? Would that be >>> acceptable? >> >> No. We'd handle this like any other collation/opclass fixes; we ask >> users to reindex their indexes in their own time after they've >> upgraded their cluster. Note that in this case it concerns an issue >> with just one GIN opclass, not all GIN indexes; so even if we were to >> address this in pg_upgrade it wouldn't be a correct choice to reindex >> every GIN index, as only a subset of those would be affected by this >> issue. >> >> Generally speaking, pg_upgrade doesn't concern itself with the >> validity of the data structures that are described by the catalogs >> that it upgrades, it only concerns itself with that it correctly >> transcribes the catalogs from one version to another, and that the >> data files of the old cluster are transfered correctly without >> changes. > > Thanks for the clarifications and the link to the release notes. That's > very helpful. Then I know how to move on and will update the patch > accordingly. Attached are the patches rebased on latest master. I've removed the ASCII fast-path patch 0006 as it turned out to be more complicated to make work than expected. I kept the radix sort patch because it gives a decent speedup but I would like to focus for now on getting patches 0001 - 0004 merged. They're all simple and, the way I see it, uncontroversial. I remeasured the savings of 0001 - 0004, which comes on top of the already committed patch that inlined the comparison function, which gave another ~5%: Data set | Patched (ms) | Master (ms) | Speedup --------------------|--------------|--------------|---------- movies(plot) | 8,058 | 10,311 | 1.27x lineitem(l_comment) | 223,233 | 256,986 | 1.19x I've also registered the change at the commit fest, see https://commitfest.postgresql.org/patch/6418/. -- David Geier Attachments: [text/x-patch] v4-0005-Optimize-generate_trgm-with-radix-sort.patch (2.9K, 2-v4-0005-Optimize-generate_trgm-with-radix-sort.patch) download | inline diff: From cc377266ea8071bb000f0c05d0aca0ab2012cdfe Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Tue, 11 Nov 2025 13:18:59 +0100 Subject: [PATCH v4 5/5] Optimize generate_trgm() with radix sort --- contrib/pg_trgm/trgm_op.c | 64 +++++++++++++++++++++++++++++++++------ 1 file changed, 54 insertions(+), 10 deletions(-) diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c index 07daf111729..1603a55022b 100644 --- a/contrib/pg_trgm/trgm_op.c +++ b/contrib/pg_trgm/trgm_op.c @@ -235,14 +235,6 @@ CMPTRGM_CHOOSE(const void *a, const void *b) } /* Define our specialized sort function name */ -#define ST_SORT trigram_qsort_signed -#define ST_ELEMENT_TYPE_VOID -#define ST_COMPARE(a, b) CMPTRGM_SIGNED(a, b) -#define ST_SCOPE static -#define ST_DEFINE -#define ST_DECLARE -#include "lib/sort_template.h" - #define ST_SORT trigram_qsort_unsigned #define ST_ELEMENT_TYPE_VOID #define ST_COMPARE(a, b) CMPTRGM_UNSIGNED(a, b) @@ -564,6 +556,58 @@ generate_trgm_only(growable_trgm_array *dst, char *str, int slen, TrgmBound **bo pfree(buf); } +/* + * Needed to properly handle negative numbers in case char is signed. + */ +static inline unsigned char FlipSign(char x) +{ + return x^0x80; +} + +static void radix_sort_trigrams_signed(trgm *trg, int count) +{ + trgm *buffer = palloc_array(trgm, count); + trgm *starts[256]; + trgm *from = trg; + trgm *to = buffer; + int freqs[3][256]; + + /* + * Compute frequencies to partition the buffer. + */ + memset(freqs, 0, sizeof(freqs)); + + for (int i=0; i<count; i++) + for (int j=0; j<3; j++) + freqs[j][FlipSign(trg[i][j])]++; + + /* + * Do the sorting. Start with last character because that's the is "LSB" + * in a trigram. Avoid unnecessary copies by ping-ponging between the buffers. + */ + for (int i=2; i>=0; i--) + { + trgm *old_from = from; + trgm *next = to; + + for (int j=0; j<256; j++) + { + starts[j] = next; + next += freqs[i][j]; + } + + for (int j=0; j<count; j++) + memcpy(starts[FlipSign(from[j][i])]++, from[j], sizeof(trgm)); + + from = to; + to = old_from; + } + + Assert(to == buffer); + memcpy(trg, buffer, sizeof(trgm) * count); + pfree(buffer); +} + /* * Make array of trigrams with sorting and removing duplicate items. * @@ -589,7 +633,7 @@ generate_trgm(char *str, int slen) if (len > 1) { if (GetDefaultCharSignedness()) - trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm)); + radix_sort_trigrams_signed((trgm *)GETARR(trg), len); else trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm)); @@ -1124,7 +1168,7 @@ generate_wildcard_trgm(const char *str, int slen) if (len > 1) { if (GetDefaultCharSignedness()) - trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm)); + radix_sort_trigrams_signed((trgm *)GETARR(trg), len); else trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm)); -- 2.51.0 [text/x-patch] v4-0004-Faster-qunique-comparator-in-generate_trgm.patch (2.0K, 3-v4-0004-Faster-qunique-comparator-in-generate_trgm.patch) download | inline diff: From 6d3a0f61928958b51989e10d5f85db2deef210cf Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Wed, 12 Nov 2025 14:27:13 +0100 Subject: [PATCH v4 4/5] Faster qunique() comparator in generate_trgm() --- contrib/pg_trgm/trgm_op.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c index 85df5ef2310..07daf111729 100644 --- a/contrib/pg_trgm/trgm_op.c +++ b/contrib/pg_trgm/trgm_op.c @@ -211,6 +211,14 @@ CMPTRGM_UNSIGNED(const void *a, const void *b) : CMPPCHAR_UNS(a, b, 2)); } +static inline int +CMPTRGM_EQ(const void *a, const void *b) +{ + char *aa = (char *)a; + char *bb = (char *)b; + return aa[0] != bb[0] || aa[1] != bb[1] || aa[2] != bb[2] ? 1 : 0; +} + /* * This gets called on the first call. It replaces the function pointer so * that subsequent calls are routed directly to the chosen implementation. @@ -581,15 +589,11 @@ generate_trgm(char *str, int slen) if (len > 1) { if (GetDefaultCharSignedness()) - { trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm)); - len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED); - } else - { trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm)); - len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED); - } + + len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_EQ); } SET_VARSIZE(trg, CALCGTSIZE(ARRKEY, len)); @@ -1120,15 +1124,11 @@ generate_wildcard_trgm(const char *str, int slen) if (len > 1) { if (GetDefaultCharSignedness()) - { trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm)); - len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED); - } else - { trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm)); - len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED); - } + + len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_EQ); } trg->flag = ARRKEY; -- 2.51.0 [text/x-patch] v4-0003-Make-btint4cmp-branchless.patch (1.0K, 4-v4-0003-Make-btint4cmp-branchless.patch) download | inline diff: From 49ea44e1bc26d3166239b4efd257c5f8c4f136fd Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Mon, 10 Nov 2025 15:40:11 +0100 Subject: [PATCH v4 3/5] Make btint4cmp() branchless --- src/backend/access/nbtree/nbtcompare.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c index 1d343377e98..ac16e3d993d 100644 --- a/src/backend/access/nbtree/nbtcompare.c +++ b/src/backend/access/nbtree/nbtcompare.c @@ -61,6 +61,7 @@ #include "utils/fmgrprotos.h" #include "utils/skipsupport.h" #include "utils/sortsupport.h" +#include "common/int.h" #ifdef STRESS_SORT_INT_MIN #define A_LESS_THAN_B INT_MIN @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS) int32 a = PG_GETARG_INT32(0); int32 b = PG_GETARG_INT32(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); + PG_RETURN_INT32(pg_cmp_s32(a, b)); } Datum -- 2.51.0 [text/x-patch] v4-0002-Optimize-generate_trgm-with-sort_template.h.patch (2.6K, 5-v4-0002-Optimize-generate_trgm-with-sort_template.h.patch) download | inline diff: From ba64b2fed648f9b3fef53e244d2665ddf1ac9ea3 Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Mon, 10 Nov 2025 13:35:11 +0100 Subject: [PATCH v4 2/5] Optimize generate_trgm() with sort_template.h --- contrib/pg_trgm/trgm_op.c | 47 ++++++++++++++++++++++++++++++--------- 1 file changed, 37 insertions(+), 10 deletions(-) diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c index ee89e548d16..85df5ef2310 100644 --- a/contrib/pg_trgm/trgm_op.c +++ b/contrib/pg_trgm/trgm_op.c @@ -226,6 +226,23 @@ CMPTRGM_CHOOSE(const void *a, const void *b) return CMPTRGM(a, b); } +/* Define our specialized sort function name */ +#define ST_SORT trigram_qsort_signed +#define ST_ELEMENT_TYPE_VOID +#define ST_COMPARE(a, b) CMPTRGM_SIGNED(a, b) +#define ST_SCOPE static +#define ST_DEFINE +#define ST_DECLARE +#include "lib/sort_template.h" + +#define ST_SORT trigram_qsort_unsigned +#define ST_ELEMENT_TYPE_VOID +#define ST_COMPARE(a, b) CMPTRGM_UNSIGNED(a, b) +#define ST_SCOPE static +#define ST_DEFINE +#define ST_DECLARE +#include "lib/sort_template.h" + /* * Deprecated function. * Use "pg_trgm.similarity_threshold" GUC variable instead of this function. @@ -281,12 +298,6 @@ show_limit(PG_FUNCTION_ARGS) PG_RETURN_FLOAT4(similarity_threshold); } -static int -comp_trgm(const void *a, const void *b) -{ - return CMPTRGM(a, b); -} - /* * Finds first word in string, returns pointer to the word, * endword points to the character after word @@ -569,8 +580,16 @@ generate_trgm(char *str, int slen) */ if (len > 1) { - qsort(GETARR(trg), len, sizeof(trgm), comp_trgm); - len = qunique(GETARR(trg), len, sizeof(trgm), comp_trgm); + if (GetDefaultCharSignedness()) + { + trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm)); + len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED); + } + else + { + trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm)); + len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED); + } } SET_VARSIZE(trg, CALCGTSIZE(ARRKEY, len)); @@ -1100,8 +1119,16 @@ generate_wildcard_trgm(const char *str, int slen) len = arr.length; if (len > 1) { - qsort(GETARR(trg), len, sizeof(trgm), comp_trgm); - len = qunique(GETARR(trg), len, sizeof(trgm), comp_trgm); + if (GetDefaultCharSignedness()) + { + trigram_qsort_signed((void *) GETARR(trg), len, sizeof(trgm)); + len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_SIGNED); + } + else + { + trigram_qsort_unsigned((void *) GETARR(trg), len, sizeof(trgm)); + len = qunique(GETARR(trg), len, sizeof(trgm), CMPTRGM_UNSIGNED); + } } trg->flag = ARRKEY; -- 2.51.0 [text/x-patch] v4-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch (7.8K, 6-v4-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch) download | inline diff: From 983e5543c0802201d8c2c43dbde0a19ae3d5ed1b Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Wed, 14 Jan 2026 10:54:32 +0100 Subject: [PATCH v4 1/5] Optimize sort and deduplication in ginExtractEntries --- src/backend/access/gin/ginutil.c | 152 +++++++++++++------------------ src/include/access/gin_private.h | 2 +- 2 files changed, 66 insertions(+), 88 deletions(-) diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c index ff927279cc3..15f4a686795 100644 --- a/src/backend/access/gin/ginutil.c +++ b/src/backend/access/gin/ginutil.c @@ -28,6 +28,7 @@ #include "utils/index_selfuncs.h" #include "utils/rel.h" #include "utils/typcache.h" +#include "lib/qunique.h" /* @@ -387,19 +388,6 @@ GinInitMetabuffer(Buffer b) ((char *) metadata + sizeof(GinMetaPageData)) - (char *) page; } -/* - * Support for sorting key datums in ginExtractEntries - * - * Note: we only have to worry about null and not-null keys here; - * ginExtractEntries never generates more than one placeholder null, - * so it doesn't have to sort those. - */ -typedef struct -{ - Datum datum; - bool isnull; -} keyEntryData; - typedef struct { FmgrInfo *cmpDatumFunc; @@ -410,24 +398,14 @@ typedef struct static int cmpEntries(const void *a, const void *b, void *arg) { - const keyEntryData *aa = (const keyEntryData *) a; - const keyEntryData *bb = (const keyEntryData *) b; + const Datum *aa = (const Datum *) a; + const Datum *bb = (const Datum *) b; cmpEntriesArg *data = (cmpEntriesArg *) arg; int res; - if (aa->isnull) - { - if (bb->isnull) - res = 0; /* NULL "=" NULL */ - else - res = 1; /* NULL ">" not-NULL */ - } - else if (bb->isnull) - res = -1; /* not-NULL "<" NULL */ - else - res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc, - data->collation, - aa->datum, bb->datum)); + res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc, + data->collation, + *aa, *bb)); /* * Detect if we have any duplicates. If there are equal keys, qsort must @@ -440,6 +418,14 @@ cmpEntries(const void *a, const void *b, void *arg) return res; } +#define ST_SORT qsort_arg_entries +#define ST_ELEMENT_TYPE Datum +#define ST_COMPARE_ARG_TYPE cmpEntriesArg +#define ST_COMPARE(a, b, arg) cmpEntries(a, b, arg) +#define ST_SCOPE static +#define ST_DEFINE +#define ST_DECLARE +#include "lib/sort_template.h" /* * Extract the index key values from an indexable item @@ -450,11 +436,13 @@ cmpEntries(const void *a, const void *b, void *arg) Datum * ginExtractEntries(GinState *ginstate, OffsetNumber attnum, Datum value, bool isNull, - int32 *nentries, GinNullCategory **categories) + int32 *nentries_p, GinNullCategory **categories_p) { Datum *entries; bool *nullFlags; - int32 i; + GinNullCategory *categories; + bool hasNull; + int32 nentries; /* * We don't call the extractValueFn on a null item. Instead generate a @@ -462,42 +450,60 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum, */ if (isNull) { - *nentries = 1; + *nentries_p = 1; entries = palloc_object(Datum); entries[0] = (Datum) 0; - *categories = palloc_object(GinNullCategory); - (*categories)[0] = GIN_CAT_NULL_ITEM; + *categories_p = palloc_object(GinNullCategory); + (*categories_p)[0] = GIN_CAT_NULL_ITEM; return entries; } /* OK, call the opclass's extractValueFn */ nullFlags = NULL; /* in case extractValue doesn't set it */ + nentries = 0; entries = (Datum *) DatumGetPointer(FunctionCall3Coll(&ginstate->extractValueFn[attnum - 1], ginstate->supportCollation[attnum - 1], value, - PointerGetDatum(nentries), + PointerGetDatum(&nentries), PointerGetDatum(&nullFlags))); /* * Generate a placeholder if the item contained no keys. */ - if (entries == NULL || *nentries <= 0) + if (entries == NULL || nentries <= 0) { - *nentries = 1; + *nentries_p = 1; entries = palloc_object(Datum); entries[0] = (Datum) 0; - *categories = palloc_object(GinNullCategory); - (*categories)[0] = GIN_CAT_EMPTY_ITEM; + *categories_p = palloc_object(GinNullCategory); + (*categories_p)[0] = GIN_CAT_EMPTY_ITEM; return entries; } /* - * If the extractValueFn didn't create a nullFlags array, create one, - * assuming that everything's non-null. + * Scan the items for any NULLs. All NULLs are considered equal, so we + * just need to check and remember if there are any. We remove them from + * the array here, and if necessary, put back one NULL entry to represent + * them all after deduplication. */ - if (nullFlags == NULL) - nullFlags = (bool *) palloc0(*nentries * sizeof(bool)); + hasNull = false; + if (nullFlags) + { + int32 numNonNulls = 0; + + for (int32 i = 0; i < nentries; i++) + { + if (nullFlags[i]) + hasNull = true; + else + { + entries[numNonNulls] = entries[i]; + numNonNulls++; + } + } + nentries = numNonNulls; + } /* * If there's more than one key, sort and unique-ify. @@ -506,63 +512,35 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum, * pretty bad too. For small numbers of keys it'd likely be better to use * a simple insertion sort. */ - if (*nentries > 1) + if (nentries > 1) { - keyEntryData *keydata; cmpEntriesArg arg; - - keydata = palloc_array(keyEntryData, *nentries); - for (i = 0; i < *nentries; i++) - { - keydata[i].datum = entries[i]; - keydata[i].isnull = nullFlags[i]; - } - arg.cmpDatumFunc = &ginstate->compareFn[attnum - 1]; arg.collation = ginstate->supportCollation[attnum - 1]; arg.haveDups = false; - qsort_arg(keydata, *nentries, sizeof(keyEntryData), - cmpEntries, &arg); - if (arg.haveDups) - { - /* there are duplicates, must get rid of 'em */ - int32 j; - - entries[0] = keydata[0].datum; - nullFlags[0] = keydata[0].isnull; - j = 1; - for (i = 1; i < *nentries; i++) - { - if (cmpEntries(&keydata[i - 1], &keydata[i], &arg) != 0) - { - entries[j] = keydata[i].datum; - nullFlags[j] = keydata[i].isnull; - j++; - } - } - *nentries = j; - } - else - { - /* easy, no duplicates */ - for (i = 0; i < *nentries; i++) - { - entries[i] = keydata[i].datum; - nullFlags[i] = keydata[i].isnull; - } - } + qsort_arg_entries(entries, nentries, &arg); - pfree(keydata); + if (arg.haveDups) + nentries = qunique_arg(entries, nentries, sizeof(Datum), cmpEntries, &arg); } /* - * Create GinNullCategory representation from nullFlags. + * Create GinNullCategory representation. */ - *categories = (GinNullCategory *) palloc0(*nentries * sizeof(GinNullCategory)); - for (i = 0; i < *nentries; i++) - (*categories)[i] = (nullFlags[i] ? GIN_CAT_NULL_KEY : GIN_CAT_NORM_KEY); + StaticAssertStmt(GIN_CAT_NORM_KEY == 0, "Assuming GIN_CAT_NORM_KEY=0"); + categories = palloc0_array(GinNullCategory, nentries + (hasNull ? 1 : 0)); + + /* Put back a NULL entry, if there were any */ + if (hasNull) + { + entries[nentries] = (Datum) 0; + categories[nentries] = GIN_CAT_NULL_KEY; + nentries++; + } + *nentries_p = nentries; + *categories_p = categories; return entries; } diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h index 7c3b4db94cd..c878546b9d2 100644 --- a/src/include/access/gin_private.h +++ b/src/include/access/gin_private.h @@ -99,7 +99,7 @@ extern void GinInitPage(Page page, uint32 f, Size pageSize); extern void GinInitMetabuffer(Buffer b); extern Datum *ginExtractEntries(GinState *ginstate, OffsetNumber attnum, Datum value, bool isNull, - int32 *nentries, GinNullCategory **categories); + int32 *nentries_p, GinNullCategory **categories_p); extern OffsetNumber gintuple_get_attrnum(GinState *ginstate, IndexTuple tuple); extern Datum gintuple_get_key(GinState *ginstate, IndexTuple tuple, -- 2.51.0 ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-07 11:27 Heikki Linnakangas <[email protected]> parent: David Geier <[email protected]> 0 siblings, 4 replies; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-07 11:27 UTC (permalink / raw) To: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; +Cc: pgsql-hackers On 03/03/2026 19:31, David Geier wrote: >> Attached are the patches rebased on latest master. >> >> I've removed the ASCII fast-path patch 0006 as it turned out to be more >> complicated to make work than expected. >> >> I kept the radix sort patch because it gives a decent speedup but I >> would like to focus for now on getting patches 0001 - 0004 merged. >> They're all simple and, the way I see it, uncontroversial. >> >> I remeasured the savings of 0001 - 0004, which comes on top of the >> already committed patch that inlined the comparison function, which gave >> another ~5%: >> >> Data set | Patched (ms) | Master (ms) | Speedup >> --------------------|--------------|--------------|---------- >> movies(plot) | 8,058 | 10,311 | 1.27x >> lineitem(l_comment) | 223,233 | 256,986 | 1.19x >> >> I've also registered the change at the commit fest, see >> https://commitfest.postgresql.org/patch/6418/. > > Attached is v5 that removes an incorrect assertion from the radix sort code. > > v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch > v5-0002-Optimize-generate_trgm-with-sort_template.h.patch > v5-0003-Make-btint4cmp-branchless.patch > v5-0004-Faster-qunique-comparator-in-generate_trgm.patch > v5-0005-Optimize-generate_trgm-with-radix-sort.patch Pushed 0001 as commit 6f5ad00ab7. I squashed 0002 and 0004 into one commit, and did some more refactoring: I created a trigram_qsort() helper function that calls the signed or unsigned variant, so that that logic doesn't need to be duplicated in the callers. For symmetry, I also added a trigram_qunique() helper function which just calls qunique() with the new, faster CMPTRGM_EQ comparator. Pushed these as commit 9f3755ea07. Patch 0003 gives me pause. It's a tiny patch: > @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS) > int32 a = PG_GETARG_INT32(0); > int32 b = PG_GETARG_INT32(1); > > - if (a > b) > - PG_RETURN_INT32(A_GREATER_THAN_B); > - else if (a == b) > - PG_RETURN_INT32(0); > - else > - PG_RETURN_INT32(A_LESS_THAN_B); > + PG_RETURN_INT32(pg_cmp_s32(a, b)); > } But the comments on the pg_cmp functions say: > * NB: If the comparator function is inlined, some compilers may produce > * worse code with these helper functions than with code with the > * following form: > * > * if (a < b) > * return -1; > * if (a > b) > * return 1; > * return 0; > * So, uh, is that really a universal improvement? Is that comment about producing worse code outdated? - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-08 02:15 John Naylor <[email protected]> parent: Heikki Linnakangas <[email protected]> 3 siblings, 1 reply; 31+ messages in thread From: John Naylor @ 2026-04-08 02:15 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; +Cc: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On Tue, Apr 7, 2026 at 6:27 PM Heikki Linnakangas <[email protected]> wrote: > But the comments on the pg_cmp functions say: > > > * NB: If the comparator function is inlined, some compilers may produce > > * worse code with these helper functions than with code with the > > * following form: > > * > > * if (a < b) > > * return -1; > > * if (a > b) > > * return 1; > > * return 0; > > * > > So, uh, is that really a universal improvement? Is that comment about > producing worse code outdated? No, it's quite recent: https://www.postgresql.org/message-id/20240212230423.GA3519%40nathanxps13 -- John Naylor Amazon Web Services ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-09 11:28 Bertrand Drouvot <[email protected]> parent: Heikki Linnakangas <[email protected]> 3 siblings, 1 reply; 31+ messages in thread From: Bertrand Drouvot @ 2026-04-09 11:28 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; +Cc: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Hi, On Tue, Apr 07, 2026 at 02:27:40PM +0300, Heikki Linnakangas wrote: > On 03/03/2026 19:31, David Geier wrote: > > > Attached are the patches rebased on latest master. > > > > > > I've removed the ASCII fast-path patch 0006 as it turned out to be more > > > complicated to make work than expected. > > > > > > I kept the radix sort patch because it gives a decent speedup but I > > > would like to focus for now on getting patches 0001 - 0004 merged. > > > They're all simple and, the way I see it, uncontroversial. > > > > > > I remeasured the savings of 0001 - 0004, which comes on top of the > > > already committed patch that inlined the comparison function, which gave > > > another ~5%: > > > > > > Data set | Patched (ms) | Master (ms) | Speedup > > > --------------------|--------------|--------------|---------- > > > movies(plot) | 8,058 | 10,311 | 1.27x > > > lineitem(l_comment) | 223,233 | 256,986 | 1.19x > > > > > > I've also registered the change at the commit fest, see > > > https://commitfest.postgresql.org/patch/6418/. > > > > Attached is v5 that removes an incorrect assertion from the radix sort code. > > > > v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch > > v5-0002-Optimize-generate_trgm-with-sort_template.h.patch > > v5-0003-Make-btint4cmp-branchless.patch > > v5-0004-Faster-qunique-comparator-in-generate_trgm.patch > > v5-0005-Optimize-generate_trgm-with-radix-sort.patch > > Pushed 0001 as commit 6f5ad00ab7. This commit makes use of StaticAssertStmt() that has been deprecated in d50c86e74375. The attached, fixes it. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-12 18:05 Tom Lane <[email protected]> parent: Heikki Linnakangas <[email protected]> 3 siblings, 2 replies; 31+ messages in thread From: Tom Lane @ 2026-04-12 18:05 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; +Cc: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Heikki Linnakangas <[email protected]> writes: > Pushed 0001 as commit 6f5ad00ab7. This commit has caused Coverity to start complaining that most of ginExtractEntries() is unreachable: *** CID 1691468: Control flow issues (DEADCODE) /srv/coverity/git/pgsql-git/postgresql/src/backend/access/gin/ginutil.c: 495 in ginExtractEntries() 489 /* 490 * Scan the items for any NULLs. All NULLs are considered equal, so we 491 * just need to check and remember if there are any. We remove them from 492 * the array here, and after deduplication, put back one NULL entry to 493 * represent them all. 494 */ >>> CID 1691468: Control flow issues (DEADCODE) >>> Execution cannot reach this statement: "hasNull = false;". 495 hasNull = false; 496 if (nullFlags) 497 { 498 int32 numNonNulls = 0; 499 500 for (int32 i = 0; i < nentries; i++) Evidently, it does not realize that the extractValueFn() can change nentries from its initial value of zero. I wouldn't be too surprised if that's related to our casting of the pointer to uintptr_t --- that may cause it to not see the passed pointer as a potential reference mechanism. I would just write that off as Coverity not being smart enough, except that I'm worried that some compiler might make a similar deduction and break the function completely. Was the switch to a local variable for nentries really a useful win performance-wise? regards, tom lane ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 09:41 Peter Eisentraut <[email protected]> parent: Bertrand Drouvot <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Peter Eisentraut @ 2026-04-13 09:41 UTC (permalink / raw) To: Bertrand Drouvot <[email protected]>; Heikki Linnakangas <[email protected]>; +Cc: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 09.04.26 13:28, Bertrand Drouvot wrote: > Hi, > > On Tue, Apr 07, 2026 at 02:27:40PM +0300, Heikki Linnakangas wrote: >> On 03/03/2026 19:31, David Geier wrote: >>>> Attached are the patches rebased on latest master. >>>> >>>> I've removed the ASCII fast-path patch 0006 as it turned out to be more >>>> complicated to make work than expected. >>>> >>>> I kept the radix sort patch because it gives a decent speedup but I >>>> would like to focus for now on getting patches 0001 - 0004 merged. >>>> They're all simple and, the way I see it, uncontroversial. >>>> >>>> I remeasured the savings of 0001 - 0004, which comes on top of the >>>> already committed patch that inlined the comparison function, which gave >>>> another ~5%: >>>> >>>> Data set | Patched (ms) | Master (ms) | Speedup >>>> --------------------|--------------|--------------|---------- >>>> movies(plot) | 8,058 | 10,311 | 1.27x >>>> lineitem(l_comment) | 223,233 | 256,986 | 1.19x >>>> >>>> I've also registered the change at the commit fest, see >>>> https://commitfest.postgresql.org/patch/6418/. >>> >>> Attached is v5 that removes an incorrect assertion from the radix sort code. >>> >>> v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch >>> v5-0002-Optimize-generate_trgm-with-sort_template.h.patch >>> v5-0003-Make-btint4cmp-branchless.patch >>> v5-0004-Faster-qunique-comparator-in-generate_trgm.patch >>> v5-0005-Optimize-generate_trgm-with-radix-sort.patch >> >> Pushed 0001 as commit 6f5ad00ab7. > > This commit makes use of StaticAssertStmt() that has been deprecated in > d50c86e74375. The attached, fixes it. I think the position of the static assertion is correct, because it refers to the palloc0_array() that follows. Maybe the comment could be a bit clearer, like "using palloc0_array requires GIN_CAT_NORM_KEY==0"? ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 11:04 Bertrand Drouvot <[email protected]> parent: Peter Eisentraut <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Bertrand Drouvot @ 2026-04-13 11:04 UTC (permalink / raw) To: Peter Eisentraut <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Hi, On Mon, Apr 13, 2026 at 11:41:02AM +0200, Peter Eisentraut wrote: > On 09.04.26 13:28, Bertrand Drouvot wrote: > > > > This commit makes use of StaticAssertStmt() that has been deprecated in > > d50c86e74375. The attached, fixes it. > > I think the position of the static assertion is correct, because it refers > to the palloc0_array() that follows. Maybe the comment could be a bit > clearer, like "using palloc0_array requires GIN_CAT_NORM_KEY==0"? Yeah that looks better to not lose the connection with palloc0_array() here. Done that way in the attached and adding new braces to avoid warning from -Wdeclaration-after-statement. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 15:03 David Geier <[email protected]> parent: Bertrand Drouvot <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-04-13 15:03 UTC (permalink / raw) To: Bertrand Drouvot <[email protected]>; Peter Eisentraut <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Hi! On 13.04.2026 13:04, Bertrand Drouvot wrote: > Hi, > > On Mon, Apr 13, 2026 at 11:41:02AM +0200, Peter Eisentraut wrote: >> On 09.04.26 13:28, Bertrand Drouvot wrote: >>> >>> This commit makes use of StaticAssertStmt() that has been deprecated in >>> d50c86e74375. The attached, fixes it. I cannot find a comment close to StaticAssertStmt() that says it got deprecated. Is the goal to completely get rid of StaticAssertStmt()? >> I think the position of the static assertion is correct, because it refers >> to the palloc0_array() that follows. Maybe the comment could be a bit >> clearer, like "using palloc0_array requires GIN_CAT_NORM_KEY==0"? > > Yeah that looks better to not lose the connection with palloc0_array() here. > Done that way in the attached and adding new braces to avoid warning from > -Wdeclaration-after-statement. Looks good to me. -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 15:05 David Geier <[email protected]> parent: John Naylor <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-04-13 15:05 UTC (permalink / raw) To: John Naylor <[email protected]>; Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; pgsql-hackers On 08.04.2026 04:15, John Naylor wrote: > On Tue, Apr 7, 2026 at 6:27 PM Heikki Linnakangas <[email protected]> wrote: >> But the comments on the pg_cmp functions say: >> >>> * NB: If the comparator function is inlined, some compilers may produce >>> * worse code with these helper functions than with code with the >>> * following form: >>> * >>> * if (a < b) >>> * return -1; >>> * if (a > b) >>> * return 1; >>> * return 0; >>> * >> >> So, uh, is that really a universal improvement? Is that comment about >> producing worse code outdated? Well spotted. Thanks! > > No, it's quite recent: > > https://www.postgresql.org/message-id/20240212230423.GA3519%40nathanxps13 In my original benchmarks it was faster. I'll rebase the remaining commits and do some more analysis. -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 15:06 David Geier <[email protected]> parent: Heikki Linnakangas <[email protected]> 3 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-04-13 15:06 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; Matthias van de Meent <[email protected]>; +Cc: pgsql-hackers Hi Heikki! > Pushed 0001 as commit 6f5ad00ab7. > > I squashed 0002 and 0004 into one commit, and did some more refactoring: > I created a trigram_qsort() helper function that calls the signed or > unsigned variant, so that that logic doesn't need to be duplicated in > the callers. For symmetry, I also added a trigram_qunique() helper > function which just calls qunique() with the new, faster CMPTRGM_EQ > comparator. Pushed these as commit 9f3755ea07. Thanks for committing these patches. -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 15:22 David Geier <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 0 replies; 31+ messages in thread From: David Geier @ 2026-04-13 15:22 UTC (permalink / raw) To: Tom Lane <[email protected]>; Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; pgsql-hackers On 12.04.2026 20:05, Tom Lane wrote: > Heikki Linnakangas <[email protected]> writes: >> Pushed 0001 as commit 6f5ad00ab7. > > This commit has caused Coverity to start complaining that > most of ginExtractEntries() is unreachable: > > *** CID 1691468: Control flow issues (DEADCODE) > /srv/coverity/git/pgsql-git/postgresql/src/backend/access/gin/ginutil.c: 495 in ginExtractEntries() > 489 /* > 490 * Scan the items for any NULLs. All NULLs are considered equal, so we > 491 * just need to check and remember if there are any. We remove them from > 492 * the array here, and after deduplication, put back one NULL entry to > 493 * represent them all. > 494 */ >>>> CID 1691468: Control flow issues (DEADCODE) >>>> Execution cannot reach this statement: "hasNull = false;". > 495 hasNull = false; > 496 if (nullFlags) > 497 { > 498 int32 numNonNulls = 0; > 499 > 500 for (int32 i = 0; i < nentries; i++) > > Evidently, it does not realize that the extractValueFn() can change > nentries from its initial value of zero. I wouldn't be too surprised > if that's related to our casting of the pointer to uintptr_t --- that > may cause it to not see the passed pointer as a potential reference > mechanism. Curious that we don't see that more frequently for other functions that have output arguments. But maybe there are just too few? > I would just write that off as Coverity not being smart enough, except > that I'm worried that some compiler might make a similar deduction and > break the function completely. Was the switch to a local variable > for nentries really a useful win performance-wise? I haven't benchmarked the variant with using the pointer directly. I can do that. -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 16:14 Heikki Linnakangas <[email protected]> parent: Tom Lane <[email protected]> 1 sibling, 1 reply; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-13 16:14 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 12/04/2026 21:05, Tom Lane wrote: > Heikki Linnakangas <[email protected]> writes: >> Pushed 0001 as commit 6f5ad00ab7. > > This commit has caused Coverity to start complaining that > most of ginExtractEntries() is unreachable: > > *** CID 1691468: Control flow issues (DEADCODE) > /srv/coverity/git/pgsql-git/postgresql/src/backend/access/gin/ginutil.c: 495 in ginExtractEntries() > 489 /* > 490 * Scan the items for any NULLs. All NULLs are considered equal, so we > 491 * just need to check and remember if there are any. We remove them from > 492 * the array here, and after deduplication, put back one NULL entry to > 493 * represent them all. > 494 */ >>>> CID 1691468: Control flow issues (DEADCODE) >>>> Execution cannot reach this statement: "hasNull = false;". > 495 hasNull = false; > 496 if (nullFlags) > 497 { > 498 int32 numNonNulls = 0; > 499 > 500 for (int32 i = 0; i < nentries; i++) > > Evidently, it does not realize that the extractValueFn() can change > nentries from its initial value of zero. I wouldn't be too surprised > if that's related to our casting of the pointer to uintptr_t --- that > may cause it to not see the passed pointer as a potential reference > mechanism. > > I would just write that off as Coverity not being smart enough, except > that I'm worried that some compiler might make a similar deduction and > break the function completely. Was the switch to a local variable > for nentries really a useful win performance-wise? I didn't do it for performance, but because I find the function easier to read that way. We could change it back. It's a pretty scary thought that a compiler might misoptimize that though. In the same function we have 'nullFlags', too, as a local variable, even before this commit. Not sure why Coverity doesn't complain about that. > /* > * PointerGetDatum > * Returns datum representation for a pointer. > */ > static inline Datum > PointerGetDatum(const void *X) > { > return (Datum) (uintptr_t) X; > } Hmm, is that 'const' incorrect? This function doesn't modify *X, but the resulting address will be used to modify it. Maybe changing it to non-const "void *X" would give Coverity a hint. - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-13 17:15 Bertrand Drouvot <[email protected]> parent: David Geier <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Bertrand Drouvot @ 2026-04-13 17:15 UTC (permalink / raw) To: David Geier <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; Heikki Linnakangas <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Hi, On Mon, Apr 13, 2026 at 05:03:11PM +0200, David Geier wrote: > Hi! > > On 13.04.2026 13:04, Bertrand Drouvot wrote: > > Hi, > > > > On Mon, Apr 13, 2026 at 11:41:02AM +0200, Peter Eisentraut wrote: > >> On 09.04.26 13:28, Bertrand Drouvot wrote: > >>> > >>> This commit makes use of StaticAssertStmt() that has been deprecated in > >>> d50c86e74375. The attached, fixes it. > > I cannot find a comment close to StaticAssertStmt() that says it got > deprecated. The comment on top of it's definition is: " /* * StaticAssertStmt() was previously used to make static assertions work as a * statement, but its use is now deprecated. */ " > Is the goal to completely get rid of StaticAssertStmt()? According to its comment, I'd say so. > > Yeah that looks better to not lose the connection with palloc0_array() here. > > Done that way in the attached and adding new braces to avoid warning from > > -Wdeclaration-after-statement. > > Looks good to me. Thanks for looking at it! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-14 07:02 David Geier <[email protected]> parent: Heikki Linnakangas <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-04-14 07:02 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; Tom Lane <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; pgsql-hackers > I didn't do it for performance, but because I find the function easier > to read that way. We could change it back. > > It's a pretty scary thought that a compiler might misoptimize that > though. In the same function we have 'nullFlags', too, as a local > variable, even before this commit. Not sure why Coverity doesn't > complain about that. > >> /* >> * PointerGetDatum >> * Returns datum representation for a pointer. >> */ >> static inline Datum >> PointerGetDatum(const void *X) >> { >> return (Datum) (uintptr_t) X; >> } > > Hmm, is that 'const' incorrect? This function doesn't modify *X, but the > resulting address will be used to modify it. Maybe changing it to non- > const "void *X" would give Coverity a hint. Ah, that could be it. Is there a way for me to run Coverity on a patch to test that out? Which Coverity CI do we actually use? Is it this one here [1]? [1] https://scan.coverity.com/projects/209? -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-14 09:22 Heikki Linnakangas <[email protected]> parent: Bertrand Drouvot <[email protected]> 0 siblings, 0 replies; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-14 09:22 UTC (permalink / raw) To: Bertrand Drouvot <[email protected]>; David Geier <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 13/04/2026 20:15, Bertrand Drouvot wrote: > On Mon, Apr 13, 2026 at 05:03:11PM +0200, David Geier wrote: >> On 13.04.2026 13:04, Bertrand Drouvot wrote: >>> On Mon, Apr 13, 2026 at 11:41:02AM +0200, Peter Eisentraut wrote: >>>> On 09.04.26 13:28, Bertrand Drouvot wrote: >>>>> >>>>> This commit makes use of StaticAssertStmt() that has been deprecated in >>>>> d50c86e74375. The attached, fixes it. >> >> I cannot find a comment close to StaticAssertStmt() that says it got >> deprecated. > > The comment on top of it's definition is: > > " > /* > * StaticAssertStmt() was previously used to make static assertions work as a > * statement, but its use is now deprecated. > */ > " > >> Is the goal to completely get rid of StaticAssertStmt()? > > According to its comment, I'd say so. > >>> Yeah that looks better to not lose the connection with palloc0_array() here. >>> Done that way in the attached and adding new braces to avoid warning from >>> -Wdeclaration-after-statement. >> >> Looks good to me. > > Thanks for looking at it! Committed this StaticAssertStmt/Decl() fix, thanks - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-14 13:05 David Geier <[email protected]> parent: David Geier <[email protected]> 0 siblings, 0 replies; 31+ messages in thread From: David Geier @ 2026-04-14 13:05 UTC (permalink / raw) To: John Naylor <[email protected]>; Heikki Linnakangas <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; pgsql-hackers On 13.04.2026 17:05, David Geier wrote: > On 08.04.2026 04:15, John Naylor wrote: >> On Tue, Apr 7, 2026 at 6:27 PM Heikki Linnakangas <[email protected]> wrote: >>> But the comments on the pg_cmp functions say: >>> >>>> * NB: If the comparator function is inlined, some compilers may produce >>>> * worse code with these helper functions than with code with the >>>> * following form: >>>> * >>>> * if (a < b) >>>> * return -1; >>>> * if (a > b) >>>> * return 1; >>>> * return 0; >>>> * >>> >>> So, uh, is that really a universal improvement? Is that comment about >>> producing worse code outdated? > > Well spotted. Thanks! > >> >> No, it's quite recent: >> >> https://www.postgresql.org/message-id/20240212230423.GA3519%40nathanxps13 FWICS, this would only matter if btint4cmp() would get inlined somewhere, where the compiler could actually make use of understanding that parts of the if-cascade are not needed. Andres' example was return DO_COMPARE(a, b) < 0 ? (DO_COMPARE(b, c) < 0 ? b : (DO_COMPARE(a, c) < 0 ? c : a)) : (DO_COMPARE(b, c) > 0 ? b : (DO_COMPARE(a, c) < 0 ? a : c)); In the case of btint4cmp(), it's only ever invoked from the function manager, where it cannot be inlined. Or are there ways to invoke btint4cmp() that can be inlined, which I'm unaware of? > In my original benchmarks it was faster. I'll rebase the remaining > commits and do some more analysis. Here is the disassembly and the perf top output of master vs patched. I compiled with GCC 15.2.0. The unpatched version of btint4cmp() contains a conditional jump, which is mispredicted frequently in the sort. The patched version is completely branchless. master ====== Dump of assembler code for function btint4cmp: 0x00005aa9e33ccdb0 <+0>: endbr64 0x00005aa9e33ccdb4 <+4>: mov 0x20(%rdi),%edx 0x00005aa9e33ccdb7 <+7>: mov $0x1,%eax 0x00005aa9e33ccdbc <+12>: cmp %edx,0x30(%rdi) 0x00005aa9e33ccdbf <+15>: jl 0x5aa9e33ccdca <btint4cmp+26> 0x00005aa9e33ccdc1 <+17>: setne %al 0x00005aa9e33ccdc4 <+20>: movzbl %al,%eax 0x00005aa9e33ccdc7 <+23>: neg %rax 0x00005aa9e33ccdca <+26>: ret 37.22% pg_trgm.so [.] trigram_qsort_signed.constprop.0 7.99% postgres [.] cmpEntryAccumulator 6.60% postgres [.] ginCombineData 6.03% postgres [.] FunctionCall2Coll 3.19% postgres [.] btint4cmp 2.30% postgres [.] rbt_insert 2.29% pg_trgm.so [.] generate_trgm 2.24% postgres [.] pg_mblen_range 1.77% libc.so.6 [.] __towlower_l 1.73% pg_trgm.so [.] trigram_qsort_signed_med3 1.56% postgres [.] pg_utf2wchar_with_len Patched ======= Dump of assembler code for function btint4cmp: 0x000055a69e87bdb0 <+0>: endbr64 0x000055a69e87bdb4 <+4>: mov 0x20(%rdi),%eax 0x000055a69e87bdb7 <+7>: cmp %eax,0x30(%rdi) 0x000055a69e87bdba <+10>: setl %al 0x000055a69e87bdbd <+13>: setg %dl 0x000055a69e87bdc0 <+16>: movzbl %dl,%edx 0x000055a69e87bdc3 <+19>: movzbl %al,%eax 0x000055a69e87bdc6 <+22>: sub %edx,%eax 0x000055a69e87bdc8 <+24>: cltq 0x000055a69e87bdca <+26>: ret 38.07% pg_trgm.so [.] trigram_qsort_signed.constprop.0 7.69% postgres [.] cmpEntryAccumulator 6.96% postgres [.] ginCombineData 3.90% postgres [.] FunctionCall2Coll 2.54% postgres [.] pg_mblen_range 2.40% postgres [.] btint4cmp 2.38% pg_trgm.so [.] generate_trgm 1.86% postgres [.] rbt_insert 1.80% libc.so.6 [.] __towlower_l 1.73% pg_trgm.so [.] trigram_qsort_signed_med3 1.66% postgres [.] pg_utf2wchar_with_len -- David Geier ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-14 14:24 David Geier <[email protected]> parent: David Geier <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: David Geier @ 2026-04-14 14:24 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; Matthias van de Meent <[email protected]>; +Cc: pgsql-hackers On 13.04.2026 17:06, David Geier wrote: >> I squashed 0002 and 0004 into one commit, and did some more refactoring: >> I created a trigram_qsort() helper function that calls the signed or >> unsigned variant, so that that logic doesn't need to be duplicated in >> the callers. For symmetry, I also added a trigram_qunique() helper >> function which just calls qunique() with the new, faster CMPTRGM_EQ >> comparator. Pushed these as commit 9f3755ea07. > > Thanks for committing these patches. Attached are the remaining patches (previously 0003 and 0005) rebased on latest master. Currently, there's no radix sort variant for the unsigned char case. Do we care about this case or is it fine if that case runs slower? The following perf profiles show that trigram_qsort() goes from ~34% down to ~7% with the radix sort optimization. The optimized run also includes the btint4cmp() optimization. Without that the result would be even better. With that change we could move on and tackle optimizing 1. 41.52% generate_trgm_only() by e.g. using an ASCII fast-patch 2. 32.72% ginInsertBAEntries() by no longer using the RB-tree but e.g. also the radix sort master - heapam_index_build_range_scan - 99.40% ginBuildCallback - ginHeapTupleBulkInsert - 66.55% ginExtractEntries - 65.29% FunctionCall3Coll - gin_extract_value_trgm - 62.80% generate_trgm + 34.33% trigram_qsort (inlined) + 26.20% generate_trgm_only + 2.23% trigram_qunique (inlined) + 1.74% detoast_attr + 1.19% qsort_arg_entries + 32.72% ginInsertBAEntries patched - heapam_index_build_range_scan - 99.42% ginBuildCallback - 95.95% ginHeapTupleBulkInsert - 59.11% ginExtractEntries - 56.93% FunctionCall3Coll - gin_extract_value_trgm - 52.19% generate_trgm + 41.52% generate_trgm_only + 7.14% trigram_qsort (inlined) + 3.53% trigram_qunique (inlined) + 4.08% detoast_attr + 2.13% qsort_arg_entries + 36.78% ginInsertBAEntries -- David Geier Attachments: [text/x-patch] v6-0002-Optimize-generate_trgm-with-radix-sort.patch (2.2K, 2-v6-0002-Optimize-generate_trgm-with-radix-sort.patch) download | inline diff: From b20910a814e8c8b64c3d74841fe27a7c12dd927a Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Tue, 11 Nov 2025 13:18:59 +0100 Subject: [PATCH v6 2/2] Optimize generate_trgm() with radix sort --- contrib/pg_trgm/trgm_op.c | 59 +++++++++++++++++++++++++++++++++------ 1 file changed, 51 insertions(+), 8 deletions(-) diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c index 0aca9b5826f..f2dbe88ece0 100644 --- a/contrib/pg_trgm/trgm_op.c +++ b/contrib/pg_trgm/trgm_op.c @@ -226,13 +226,56 @@ CMPTRGM_CHOOSE(const void *a, const void *b) return CMPTRGM(a, b); } -#define ST_SORT trigram_qsort_signed -#define ST_ELEMENT_TYPE_VOID -#define ST_COMPARE(a, b) CMPTRGM_SIGNED(a, b) -#define ST_SCOPE static -#define ST_DEFINE -#define ST_DECLARE -#include "lib/sort_template.h" +/* + * Needed to properly handle negative numbers in case char is signed. + */ +static inline unsigned char FlipSign(char x) +{ + return x^0x80; +} + +static void radix_sort_trigrams_signed(trgm *trg, int count) +{ + trgm *buffer = palloc_array(trgm, count); + trgm *starts[256]; + trgm *from = trg; + trgm *to = buffer; + int freqs[3][256]; + + /* + * Compute frequencies to partition the buffer. + */ + memset(freqs, 0, sizeof(freqs)); + + for (int i=0; i<count; i++) + for (int j=0; j<3; j++) + freqs[j][FlipSign(trg[i][j])]++; + + /* + * Do the sorting. Start with last character because that's the is "LSB" + * in a trigram. Avoid unnecessary copies by ping-ponging between the buffers. + */ + for (int i=2; i>=0; i--) + { + trgm *old_from = from; + trgm *next = to; + + for (int j=0; j<256; j++) + { + starts[j] = next; + next += freqs[i][j]; + } + + for (int j=0; j<count; j++) + memcpy(starts[FlipSign(from[j][i])]++, from[j], sizeof(trgm)); + + from = to; + to = old_from; + } + + memcpy(trg, buffer, sizeof(trgm) * count); + pfree(buffer); +} #define ST_SORT trigram_qsort_unsigned #define ST_ELEMENT_TYPE_VOID @@ -247,7 +290,7 @@ static void trigram_qsort(trgm *array, size_t n) { if (GetDefaultCharSignedness()) - trigram_qsort_signed(array, n, sizeof(trgm)); + radix_sort_trigrams_signed(array, n); else trigram_qsort_unsigned(array, n, sizeof(trgm)); } -- 2.51.0 [text/x-patch] v6-0001-Make-btint4cmp-branchless.patch (1.0K, 3-v6-0001-Make-btint4cmp-branchless.patch) download | inline diff: From 0f39f8618c784299b30566a0bf68beef8b2b8aef Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Mon, 10 Nov 2025 15:40:11 +0100 Subject: [PATCH v6 1/2] Make btint4cmp() branchless --- src/backend/access/nbtree/nbtcompare.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c index 1d343377e98..ac16e3d993d 100644 --- a/src/backend/access/nbtree/nbtcompare.c +++ b/src/backend/access/nbtree/nbtcompare.c @@ -61,6 +61,7 @@ #include "utils/fmgrprotos.h" #include "utils/skipsupport.h" #include "utils/sortsupport.h" +#include "common/int.h" #ifdef STRESS_SORT_INT_MIN #define A_LESS_THAN_B INT_MIN @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS) int32 a = PG_GETARG_INT32(0); int32 b = PG_GETARG_INT32(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); + PG_RETURN_INT32(pg_cmp_s32(a, b)); } Datum -- 2.51.0 ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-15 11:06 Heikki Linnakangas <[email protected]> parent: David Geier <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-15 11:06 UTC (permalink / raw) To: David Geier <[email protected]>; Tom Lane <[email protected]>; Peter Eisentraut <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; pgsql-hackers On 14/04/2026 10:02, David Geier wrote: >> I didn't do it for performance, but because I find the function easier >> to read that way. We could change it back. >> >> It's a pretty scary thought that a compiler might misoptimize that >> though. In the same function we have 'nullFlags', too, as a local >> variable, even before this commit. Not sure why Coverity doesn't >> complain about that. >> >>> /* >>> * PointerGetDatum >>> * Returns datum representation for a pointer. >>> */ >>> static inline Datum >>> PointerGetDatum(const void *X) >>> { >>> return (Datum) (uintptr_t) X; >>> } >> >> Hmm, is that 'const' incorrect? This function doesn't modify *X, but the >> resulting address will be used to modify it. Maybe changing it to non- >> const "void *X" would give Coverity a hint. This was briefly discussed when PointerGetDatum() was changed from a macro to a static inline function [1]. On that email, Peter pointed out that the compiler was doing the same deduction that Coverity did now, i.e. that if you pass the Datum returned by PointerGetDatum(&foo) to a function, it cannot change *foo. I'm surprised we dismissed that worry so quickly. If the compiler optimizes based on that assumption, you can get incorrect code. Three alternative fixes were discussed on that thread. Here's a fourth one that I think is better; #define PointerGetDatum(X) \ ((Datum) (uintptr_t) (true ? (X) : NULL)) I found this trick with the dummy conditional expression at [2]. It always evaluates to just (X), but it has the effect that you get a compiler error if (X) is not a pointer. [1] https://www.postgresql.org/message-id/812568f2-ff1d-ebd9-aee6-e00d8f2e0fb6%40enterprisedb.com [2] See "TO_VOID_PTR_EXPR()" at https://medium.com/@pauljlucas/generic-in-c-d7ab47e3b5ab > Ah, that could be it. > Is there a way for me to run Coverity on a patch to test that out? Not really I'm afraid. I can commit a fix and we'll see if it helps the next time that Coverity runs (= Sunday). > Which Coverity CI do we actually use? Is it this one here [1]? > > [1] https://scan.coverity.com/projects/209? Yeah, that's the one, but only the security team has access. - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-15 19:12 Peter Eisentraut <[email protected]> parent: Heikki Linnakangas <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Peter Eisentraut @ 2026-04-15 19:12 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; David Geier <[email protected]>; Tom Lane <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; pgsql-hackers On 15.04.26 13:06, Heikki Linnakangas wrote: > On 14/04/2026 10:02, David Geier wrote: >>> I didn't do it for performance, but because I find the function easier >>> to read that way. We could change it back. >>> >>> It's a pretty scary thought that a compiler might misoptimize that >>> though. In the same function we have 'nullFlags', too, as a local >>> variable, even before this commit. Not sure why Coverity doesn't >>> complain about that. >>> >>>> /* >>>> * PointerGetDatum >>>> * Returns datum representation for a pointer. >>>> */ >>>> static inline Datum >>>> PointerGetDatum(const void *X) >>>> { >>>> return (Datum) (uintptr_t) X; >>>> } >>> >>> Hmm, is that 'const' incorrect? This function doesn't modify *X, but the >>> resulting address will be used to modify it. Maybe changing it to non- >>> const "void *X" would give Coverity a hint. > > This was briefly discussed when PointerGetDatum() was changed from a > macro to a static inline function [1]. On that email, Peter pointed out > that the compiler was doing the same deduction that Coverity did now, > i.e. that if you pass the Datum returned by PointerGetDatum(&foo) to a > function, it cannot change *foo. I'm surprised we dismissed that worry > so quickly. If the compiler optimizes based on that assumption, you can > get incorrect code. I don't think this is in evidence. AFAICT, it's just Coverity that is complaining here, which is its right, but the code is not incorrect. ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-15 21:25 Tom Lane <[email protected]> parent: Peter Eisentraut <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Tom Lane @ 2026-04-15 21:25 UTC (permalink / raw) To: Peter Eisentraut <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Peter Eisentraut <[email protected]> writes: > On 15.04.26 13:06, Heikki Linnakangas wrote: >> This was briefly discussed when PointerGetDatum() was changed from a >> macro to a static inline function [1]. On that email, Peter pointed out >> that the compiler was doing the same deduction that Coverity did now, >> i.e. that if you pass the Datum returned by PointerGetDatum(&foo) to a >> function, it cannot change *foo. I'm surprised we dismissed that worry >> so quickly. If the compiler optimizes based on that assumption, you can >> get incorrect code. > I don't think this is in evidence. AFAICT, it's just Coverity that is > complaining here, which is its right, but the code is not incorrect. Are you sure? This seems like the sort of thing that will bite us on the rear sometime in the future, as the compiler geeks put in more and more aggressive optimizations. I think we should at least test the theory that changing PointerGetDatum to remove the const cast would silence Coverity's complaint. If it does not then we're attributing too much intelligence to Coverity. But if it does, then we've correctly identified why it's complaining, and we should take seriously the idea that they aren't the only ones making this sort of deduction (or won't be for long). regards, tom lane ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-16 08:45 Peter Eisentraut <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 2 replies; 31+ messages in thread From: Peter Eisentraut @ 2026-04-16 08:45 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 15.04.26 23:25, Tom Lane wrote: > Peter Eisentraut <[email protected]> writes: >> On 15.04.26 13:06, Heikki Linnakangas wrote: >>> This was briefly discussed when PointerGetDatum() was changed from a >>> macro to a static inline function [1]. On that email, Peter pointed out >>> that the compiler was doing the same deduction that Coverity did now, >>> i.e. that if you pass the Datum returned by PointerGetDatum(&foo) to a >>> function, it cannot change *foo. I'm surprised we dismissed that worry >>> so quickly. If the compiler optimizes based on that assumption, you can >>> get incorrect code. > >> I don't think this is in evidence. AFAICT, it's just Coverity that is >> complaining here, which is its right, but the code is not incorrect. > > Are you sure? This seems like the sort of thing that will bite us on > the rear sometime in the future, as the compiler geeks put in more and > more aggressive optimizations. > > I think we should at least test the theory that changing > PointerGetDatum to remove the const cast would silence Coverity's > complaint. If it does not then we're attributing too much > intelligence to Coverity. But if it does, then we've correctly > identified why it's complaining, and we should take seriously the > idea that they aren't the only ones making this sort of deduction > (or won't be for long). I think it's quite clear to me that Coverity is complaining about this correctly, in its view of the world. Compilers sometimes complain about this, too, although in this case they apparently don't look quite as deeply to do this analysis. What I'm missing here is, essentially where the previous thread stopped: What is the overall message that we want to communicate with the API? If the default assumption is that what pointers converted to Datums point to should not be modified on the other side (where the Datum is converted back to a pointer), then the current declaration of PointerGetDatum() is suitable, and the GIN code can be considered an exception and we make a special API for that. The previous thread proposed NonconstPointerGetDatum(). (If this is the resolution, I also have half a patch somewhere that makes the string input argument for the InputFunctionCall family of functions const, which also seems intuitively sensible.) If, on the other hand, the decision is that there is in fact no such guarantee, that consumers of Datums are free to modify whatever they seem fit, then we should drop the const of PointerGetDatum and fix the fallout up the call stack. The macro proposed by Heikki, I don't know, still doesn't actually answer this question, just (possibly) makes these warnings go away in a slightly mysterious way. ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-16 09:49 Heikki Linnakangas <[email protected]> parent: Peter Eisentraut <[email protected]> 1 sibling, 1 reply; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-16 09:49 UTC (permalink / raw) To: Peter Eisentraut <[email protected]>; Tom Lane <[email protected]>; +Cc: David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 16/04/2026 11:45, Peter Eisentraut wrote: > On 15.04.26 23:25, Tom Lane wrote: >> Peter Eisentraut <[email protected]> writes: >>> On 15.04.26 13:06, Heikki Linnakangas wrote: >>>> This was briefly discussed when PointerGetDatum() was changed from a >>>> macro to a static inline function [1]. On that email, Peter pointed out >>>> that the compiler was doing the same deduction that Coverity did now, >>>> i.e. that if you pass the Datum returned by PointerGetDatum(&foo) to a >>>> function, it cannot change *foo. I'm surprised we dismissed that worry >>>> so quickly. If the compiler optimizes based on that assumption, you can >>>> get incorrect code. >> >>> I don't think this is in evidence. AFAICT, it's just Coverity that is >>> complaining here, which is its right, but the code is not incorrect. >> >> Are you sure? This seems like the sort of thing that will bite us on >> the rear sometime in the future, as the compiler geeks put in more and >> more aggressive optimizations. >> >> I think we should at least test the theory that changing >> PointerGetDatum to remove the const cast would silence Coverity's >> complaint. If it does not then we're attributing too much >> intelligence to Coverity. But if it does, then we've correctly >> identified why it's complaining, and we should take seriously the >> idea that they aren't the only ones making this sort of deduction >> (or won't be for long). > > I think it's quite clear to me that Coverity is complaining about this > correctly, in its view of the world. Compilers sometimes complain about > this, too, although in this case they apparently don't look quite as > deeply to do this analysis. > > What I'm missing here is, essentially where the previous thread stopped: > What is the overall message that we want to communicate with the API? > > If the default assumption is that what pointers converted to Datums > point to should not be modified on the other side (where the Datum is > converted back to a pointer), then the current declaration of > PointerGetDatum() is suitable, and the GIN code can be considered an > exception and we make a special API for that. The previous thread > proposed NonconstPointerGetDatum(). > > (If this is the resolution, I also have half a patch somewhere that > makes the string input argument for the InputFunctionCall family of > functions const, which also seems intuitively sensible.) > > If, on the other hand, the decision is that there is in fact no such > guarantee, that consumers of Datums are free to modify whatever they > seem fit, then we should drop the const of PointerGetDatum and fix the > fallout up the call stack. > > The macro proposed by Heikki, I don't know, still doesn't actually > answer this question, just (possibly) makes these warnings go away in a > slightly mysterious way. My intention was that if you do: const foo *ptr; ... datum = PointerGetDatum(ptr); Then you're not allowed to modify the contents of *ptr through the 'datum'. But if you do: foo *ptr; ... datum = PointerGetDatum(ptr); Then it is allowed. I don't know how to tell the compiler exactly that. I tried various hacks with _Generic(), but couldn't make it work. The macro casts away the 'const' in the first case, which might disable some optimizations that the compiler could otherwise do. We could have all three: ConstPointerGetDatum(const void *): Promises that the returned Datum is not used to modify NonConstPointerGetDatum(void *): No such promise. PointerGetDatum(): Macro as I proposed. Like NonConstPointerGetDatum(), but for backwards-compatibility it doesn't emit a warning if you pass a const pointer to it. - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-16 14:37 Tom Lane <[email protected]> parent: Heikki Linnakangas <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Tom Lane @ 2026-04-16 14:37 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Heikki Linnakangas <[email protected]> writes: > On 16/04/2026 11:45, Peter Eisentraut wrote: >> What I'm missing here is, essentially where the previous thread stopped: >> What is the overall message that we want to communicate with the API? Good point. >> If the default assumption is that what pointers converted to Datums >> point to should not be modified on the other side (where the Datum is >> converted back to a pointer), then the current declaration of >> PointerGetDatum() is suitable, and the GIN code can be considered an >> exception and we make a special API for that. The previous thread >> proposed NonconstPointerGetDatum(). I think there can be no doubt that most functions receiving a pass-by-ref Datum are not supposed to scribble on the pointed-to data. So it makes sense to me that PointerGetDatum should carry an implication of const-ness, and then we need to invent a new notation to use in the small number of places where that's not appropriate. I'd capitalize it as NonConstPointerGetDatum, but other than that nit that naming suggestion seems fine to me. Of course, then the *real* question is why DatumGetPointer doesn't deliver a const pointer. But I don't see how to get there without extremely invasive changes. > We could have all three: Not excited about making massive changes for this. I remain far less certain than Peter is that this discussion has anything to do with why Coverity is complaining about ginExtractEntries. I still think we should make some minimum-effort change to see if the complaint goes away before expending a lot of brain cells on choosing a final fix. regards, tom lane ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-16 14:43 Andres Freund <[email protected]> parent: Peter Eisentraut <[email protected]> 1 sibling, 0 replies; 31+ messages in thread From: Andres Freund @ 2026-04-16 14:43 UTC (permalink / raw) To: [email protected]; Peter Eisentraut <[email protected]>; Tom Lane <[email protected]>; +Cc: Heikki Linnakangas <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Hi, On April 16, 2026 4:45:55 AM EDT, Peter Eisentraut <[email protected]> wrote: >On 15.04.26 23:25, Tom Lane wrote: >> Peter Eisentraut <[email protected]> writes: >>> On 15.04.26 13:06, Heikki Linnakangas wrote: >>>> This was briefly discussed when PointerGetDatum() was changed from a >>>> macro to a static inline function [1]. On that email, Peter pointed out >>>> that the compiler was doing the same deduction that Coverity did now, >>>> i.e. that if you pass the Datum returned by PointerGetDatum(&foo) to a >>>> function, it cannot change *foo. I'm surprised we dismissed that worry >>>> so quickly. If the compiler optimizes based on that assumption, you can >>>> get incorrect code. >> >>> I don't think this is in evidence. AFAICT, it's just Coverity that is >>> complaining here, which is its right, but the code is not incorrect. >> >> Are you sure? This seems like the sort of thing that will bite us on >> the rear sometime in the future, as the compiler geeks put in more and >> more aggressive optimizations. >> >> I think we should at least test the theory that changing >> PointerGetDatum to remove the const cast would silence Coverity's >> complaint. If it does not then we're attributing too much >> intelligence to Coverity. But if it does, then we've correctly >> identified why it's complaining, and we should take seriously the >> idea that they aren't the only ones making this sort of deduction >> (or won't be for long). > >I think it's quite clear to me that Coverity is complaining about this correctly, in its view of the world. Compilers sometimes complain about this, too, although in this case they apparently don't look quite as deeply to do this analysis. > >What I'm missing here is, essentially where the previous thread stopped: What is the overall message that we want to communicate with the API? > >If the default assumption is that what pointers converted to Datums point to should not be modified on the other side (where the Datum is converted back to a pointer), then the current declaration of PointerGetDatum() is suitable, and the GIN code can be considered an exception and we make a special API for that. The previous thread proposed NonconstPointerGetDatum(). To me it seems way way way to dangerous to just redefine what PointerGetDatum() means for all existing callers, without doing an exhaustive verification of all the callers. Separately from that, it doesn't seem defensible to take a const pointer and return a non const one. Why is that sane? Greetings, Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-16 17:30 Heikki Linnakangas <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-16 17:30 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 16/04/2026 17:37, Tom Lane wrote: > Heikki Linnakangas <[email protected]> writes: >> On 16/04/2026 11:45, Peter Eisentraut wrote: >>> What I'm missing here is, essentially where the previous thread stopped: >>> What is the overall message that we want to communicate with the API? > > Good point. > >>> If the default assumption is that what pointers converted to Datums >>> point to should not be modified on the other side (where the Datum is >>> converted back to a pointer), then the current declaration of >>> PointerGetDatum() is suitable, and the GIN code can be considered an >>> exception and we make a special API for that. The previous thread >>> proposed NonconstPointerGetDatum(). > > I think there can be no doubt that most functions receiving a > pass-by-ref Datum are not supposed to scribble on the pointed-to > data. So it makes sense to me that PointerGetDatum should carry > an implication of const-ness, and then we need to invent a new > notation to use in the small number of places where that's not > appropriate. I'd capitalize it as NonConstPointerGetDatum, > but other than that nit that naming suggestion seems fine to me. That makes sense. My worry is that we're changing the rules in a very subtle way: It used to be OK to use PointerGetDatum(), pass the resulting datum to something that modifies it. Now we say it's not OK, and you must use NonConstPointerGetDatum(). You don't get any compiler warnings if you use it wrong, except for this one coverity warning apparently, but it doesn't catch this reliably either. > Of course, then the *real* question is why DatumGetPointer > doesn't deliver a const pointer. But I don't see how to get > there without extremely invasive changes. Good point. >> We could have all three: > > Not excited about making massive changes for this. Having all three would be a very localized change in postgres.h. > I remain far less certain than Peter is that this discussion has > anything to do with why Coverity is complaining about > ginExtractEntries. I still think we should make some minimum-effort > change to see if the complaint goes away before expending a lot of > brain cells on choosing a final fix. I think I'm going to commit my proposal to turn PointerGetDatum() back into a macro, and see if that makes Coverity happy. Then we'll know, and we can decide on the next steps. Any objections? One open question is whether we should backpatch any of this. I guess compilers don't misoptimize this in practice, or we would've gotten more reports, but I really can't rationalize why not and a new compiler version might well start hitting this. - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-16 17:47 Tom Lane <[email protected]> parent: Heikki Linnakangas <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Tom Lane @ 2026-04-16 17:47 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers Heikki Linnakangas <[email protected]> writes: > On 16/04/2026 17:37, Tom Lane wrote: >> Not excited about making massive changes for this. > Having all three would be a very localized change in postgres.h. Sure, but *using* them in a consistent way would be invasive. >> I remain far less certain than Peter is that this discussion has >> anything to do with why Coverity is complaining about >> ginExtractEntries. I still think we should make some minimum-effort >> change to see if the complaint goes away before expending a lot of >> brain cells on choosing a final fix. > I think I'm going to commit my proposal to turn PointerGetDatum() back > into a macro, and see if that makes Coverity happy. Then we'll know, and > we can decide on the next steps. Any objections? WFM. regards, tom lane ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-17 19:21 Heikki Linnakangas <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 1 reply; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-17 19:21 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; David Geier <[email protected]>; Matthias van de Meent <[email protected]>; pgsql-hackers On 16/04/2026 20:47, Tom Lane wrote: > Heikki Linnakangas <[email protected]> writes: >> On 16/04/2026 17:37, Tom Lane wrote: >>> Not excited about making massive changes for this. > >> Having all three would be a very localized change in postgres.h. > > Sure, but *using* them in a consistent way would be invasive. > >>> I remain far less certain than Peter is that this discussion has >>> anything to do with why Coverity is complaining about >>> ginExtractEntries. I still think we should make some minimum-effort >>> change to see if the complaint goes away before expending a lot of >>> brain cells on choosing a final fix. > >> I think I'm going to commit my proposal to turn PointerGetDatum() back >> into a macro, and see if that makes Coverity happy. Then we'll know, and >> we can decide on the next steps. Any objections? > > WFM. Grepping for PointerGetDatum(), there are a bunch of wrappers of it for specific types, like: static inline Datum CStringGetDatum(const char *X) static inline Datum NumericGetDatum(Numeric X) Most are marked "const". These all potentially have the same problem, but I think for these it is a good assumption that resulting Datum will not be used to modify *X, so we can leave them alone. I guess we didn't do that for NumericGetDatum just because the Numeric typedef doesn't allow that. There's also: static inline Datum fetch_att(const void *T, bool attbyval, int attlen) The "const" seems reasonable on that too. This is an interesting case: static inline Datum EOHPGetRWDatum(const struct ExpandedObjectHeader *eohptr) { return PointerGetDatum(eohptr->eoh_rw_ptr); } That RW stands for read/write, which sounds alarming. But the returned datum points to eohptr->eoh_rw_ptr rather than *eohptr itself, so I think the 'const' is correct here after all. So, pushed a commit that changes just PointerGetDatum() itself, leaving all those others alone. - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-22 21:11 Heikki Linnakangas <[email protected]> parent: Heikki Linnakangas <[email protected]> 0 siblings, 0 replies; 31+ messages in thread From: Heikki Linnakangas @ 2026-04-22 21:11 UTC (permalink / raw) To: Tom Lane <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers; David Geier <[email protected]> On 17/04/2026 22:21, Heikki Linnakangas wrote: > On 16/04/2026 20:47, Tom Lane wrote: >> Heikki Linnakangas <[email protected]> writes: >>> On 16/04/2026 17:37, Tom Lane wrote: >>>> Not excited about making massive changes for this. >> >>> Having all three would be a very localized change in postgres.h. >> >> Sure, but *using* them in a consistent way would be invasive. >> >>>> I remain far less certain than Peter is that this discussion has >>>> anything to do with why Coverity is complaining about >>>> ginExtractEntries. I still think we should make some minimum-effort >>>> change to see if the complaint goes away before expending a lot of >>>> brain cells on choosing a final fix. >> >>> I think I'm going to commit my proposal to turn PointerGetDatum() back >>> into a macro, and see if that makes Coverity happy. Then we'll know, and >>> we can decide on the next steps. Any objections? >> >> WFM. > > ... > > So, pushed a commit that changes just PointerGetDatum() itself, leaving > all those others alone. As we thought, this made the Coverity warning go away. I'm happy with the status quo in master, but if we want to introduce new ConstPointerGetDatum() or NonConstPointerGetDatum() variants instead of the macro, now is the time to do it. For backbranches, IMHO we should go with the macro. It's a little scary to replace such a widely used function as PointerGetDatum() in back-branches, but I do think this should be fixed. Introducing new variants doesn't seems even less backpatchable. - Heikki ^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Reduce build times of pg_trgm GIN indexes @ 2026-04-22 21:26 David Geier <[email protected]> parent: David Geier <[email protected]> 0 siblings, 0 replies; 31+ messages in thread From: David Geier @ 2026-04-22 21:26 UTC (permalink / raw) To: Heikki Linnakangas <[email protected]>; Matthias van de Meent <[email protected]>; +Cc: pgsql-hackers On 14.04.2026 16:24, David Geier wrote: > Attached are the remaining patches (previously 0003 and 0005) rebased on > latest master. Currently, there's no radix sort variant for the unsigned > char case. Do we care about this case or is it fine if that case runs > slower? > > The following perf profiles show that trigram_qsort() goes from ~34% > down to ~7% with the radix sort optimization. The optimized run also > includes the btint4cmp() optimization. Without that the result would be > even better. > > With that change we could move on and tackle optimizing > > 1. 41.52% generate_trgm_only() by e.g. using an ASCII fast-patch > 2. 32.72% ginInsertBAEntries() by no longer using the RB-tree but > e.g. also the radix sort Attached is the rebased patch set as well as a new patch that optimizes ginInsertBAEntries(). Performance improvements are as follows, measured with the same benchmark I used in the first mail of this thread. Runtimes and deltas are in milliseconds. Code | movies | delta | lineitem | delta -----------------------------------|--------|--------|------------------ master | 11,160 | - | 248,146 | - v7-0001-Make-btint4cmp-branchless | 9,509 | 1,651 | 236,760 | 11,386 v7-0002-Use-radix-sort | 6,123 | 3,386 | 214,632 | 22,128 v7-0003-Replace-RB-tree | 4,755 | 1,368 | 144,252 | 70,380 I optimized ginInsertBAEntries() by replacing the red-black tree by a hash map (simplehash.h) on the keys (e.g. trigrams), where each hash map entry contains an item pointer list with the row TIDs the key occurs in. This is in the spirit of the original red-black tree implementation but this way the deduplication of each key is handled in O(1), instead of O(log(num_unique_keys)). This reduces the overall runtime complexity from O(num_total_keys * log(num_unique_keys)) -> O(num_total_keys + num_unique_keys * log(num_unique_keys)) As there are normally a lot less keys than rows, this is complexity-wise preferable. And even if all keys are unique, the rebalancing operations are pretty expensive and in reality outweigh the slightly better complexity in the worst-case. - The sorting of the item pointer lists for each key stays as is, except for that I switched to sort_template.h. - The red-black tree code in rbtree.c/.h is now completely unused and could be removed. Thoughts? - The size on disk changed very slightly. I think this is because the memory accounting is slightly different and with that slightly more or less keys / item pointers can be processed in one go. - FWICS, we can use datumIsEqual() and datum_image_hash() in the hash map because as of today, the GIN index code doesn't work properly with non-deterministic collations / non-image-equal types, e.g. see [1]. - I also simplified ginInsertBAEntries(). Previously, it used an insertion order that minimizes the number of RB-tree rebalancing operations for sorted inputs. With the hash map approach, it's better to insert in sort order as this increases the likelihood of hitting the same hash map entry again for equal keys. - This patch set also improves parallel GIN index builds as the parallel code builds on top of ginInsertBAEntries(). - Regression tests pass and gin_index_check() from amcheck couldn't find an error in the data structures. To be fair: 0001 is less interesting after removing the RB-tree because a lot less key comparisons are being performed. I still think it makes sense to merge it as it should also accelerate e.g. B-tree traversals. With these changes the number one bottleneck becomes the trigram generation in generate_trgm_only(). The following perf profile nicely shows that: - 99.89% 0.00% postgres postgres [.] ginbuild ginbuild table_index_build_scan (inlined) - heapam_index_build_range_scan - 94.05% ginBuildCallback - 86.48% ginHeapTupleBulkInsert - 68.57% ginExtractEntries - 64.69% FunctionCall3Coll - gin_extract_value_trgm - 62.75% generate_trgm - 39.57% generate_trgm_only + 18.53% str_tolower + 16.91% find_word (inlined) + 1.36% make_trigrams (inlined) 0.66% __strlen_avx2 + 18.49% trigram_qsort (inlined) + 4.17% trigram_qunique (inlined) 0.79% trgm2int + 3.33% qsort_arg_entries - 17.28% ginInsertBAEntries - ginInsertBAEntry (inlined) - 8.04% ginbuild_insert (inlined) - 4.89% ginbuild_insert_hash_internal (inlined) 1.25% gin_equal_key (inlined) 0.68% ginbuild_initial_bucket (inlined) + 3.14% gin_hash_key (inlined) + 2.45% AllocSetRealloc + 0.85% asm_exc_page_fault 0.52% getDatumCopy (inlined) + 6.00% ginEntryInsert + 1.19% ginBeginBAScan + 2.99% heap_getnext [1] https://www.postgresql.org/message-id/flat/8ef4899c4acfebca45cc6c042a6dc611d25ffab1.camel%40cybertec... -- David Geier Attachments: [text/x-patch] v7-0003-Replace-RB-tree-with-hash-map-and-sort-in-GIN-ind.patch (14.6K, 2-v7-0003-Replace-RB-tree-with-hash-map-and-sort-in-GIN-ind.patch) download | inline diff: From c3fb626cae56bcb620af02be55dd18ccab97da72 Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Wed, 22 Apr 2026 14:00:40 +0200 Subject: [PATCH v7 3/3] Replace RB-tree with hash map and sort in GIN index --- src/backend/access/gin/ginbulk.c | 343 +++++++++++++++---------------- src/include/access/gin_private.h | 24 +-- 2 files changed, 172 insertions(+), 195 deletions(-) diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c index 85865b39105..e2d23786225 100644 --- a/src/backend/access/gin/ginbulk.c +++ b/src/backend/access/gin/ginbulk.c @@ -17,107 +17,118 @@ #include <limits.h> #include "access/gin_private.h" +#include "common/hashfn.h" #include "utils/datum.h" #include "utils/memutils.h" +#define DEF_NENTRY 2048 /* Initial hash table size */ +#define DEF_ITEMS_PER_KEY 8 /* Initial ItemPointer array size per key */ -#define DEF_NENTRY 2048 /* GinEntryAccumulator allocation quantum */ -#define DEF_NPTR 5 /* ItemPointer initial allocation quantum */ - - -/* Combiner function for rbtree.c */ -static void -ginCombineData(RBTNode *existing, const RBTNode *newdata, void *arg) +typedef struct GinHashKey { - GinEntryAccumulator *eo = (GinEntryAccumulator *) existing; - const GinEntryAccumulator *en = (const GinEntryAccumulator *) newdata; - BuildAccumulator *accum = (BuildAccumulator *) arg; + OffsetNumber attnum; + GinNullCategory category; + Datum key; +} GinHashKey; - /* - * Note this code assumes that newdata contains only one itempointer. - */ - if (eo->count >= eo->maxcount) - { - if (eo->maxcount > INT_MAX) - ereport(ERROR, - (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), - errmsg("posting list is too long"), - errhint("Reduce \"maintenance_work_mem\"."))); +typedef struct GinHashEntry +{ + GinHashKey hashkey; + uint32 hash; + char status; + ItemPointerData * items; + uint32 numItems; + uint32 allocatedItems; +} GinHashEntry; + +typedef struct GinSortEntry +{ + GinHashKey hashkey; + ItemPointerData * items; + uint32 numItems; +} GinSortEntry; + +static uint32 gin_hash_key(struct ginbuild_hash *tb, GinHashKey *key); +static bool gin_equal_key(struct ginbuild_hash *tb, GinHashKey *a, GinHashKey *b); + +#define SH_PREFIX ginbuild +#define SH_ELEMENT_TYPE GinHashEntry +#define SH_KEY_TYPE GinHashKey +#define SH_KEY hashkey +#define SH_HASH_KEY(tb, key) gin_hash_key(tb, &key) +#define SH_EQUAL(tb, a, b) gin_equal_key(tb, &a, &b) +#define SH_SCOPE static inline +#define SH_STORE_HASH +#define SH_GET_HASH(tb, a) (a)->hash +#define SH_DEFINE +#define SH_DECLARE +#include "lib/simplehash.h" + +static uint32 +gin_hash_key(struct ginbuild_hash *tb, GinHashKey *key) +{ + BuildAccumulator *accum = (BuildAccumulator *) tb->private_data; + uint32 hash; - accum->allocatedMemory -= GetMemoryChunkSpace(eo->list); - eo->maxcount *= 2; - eo->list = (ItemPointerData *) - repalloc_huge(eo->list, sizeof(ItemPointerData) * eo->maxcount); - accum->allocatedMemory += GetMemoryChunkSpace(eo->list); - } + hash = hash_combine(0, murmurhash32((uint32) key->attnum)); + hash = hash_combine(hash, murmurhash32((uint32) key->category)); - /* If item pointers are not ordered, they will need to be sorted later */ - if (eo->shouldSort == false) + if (key->category == GIN_CAT_NORM_KEY) { - int res; - - res = ginCompareItemPointers(eo->list + eo->count - 1, en->list); - Assert(res != 0); + CompactAttribute *att; - if (res > 0) - eo->shouldSort = true; + att = TupleDescCompactAttr(accum->ginstate->origTupdesc, key->attnum - 1); + hash = hash_combine(hash, datum_image_hash(key->key, att->attbyval, att->attlen)); } - eo->list[eo->count] = en->list[0]; - eo->count++; + return hash; } -/* Comparator function for rbtree.c */ -static int -cmpEntryAccumulator(const RBTNode *a, const RBTNode *b, void *arg) +static bool +gin_equal_key(struct ginbuild_hash *tb, GinHashKey *a, GinHashKey *b) { - const GinEntryAccumulator *ea = (const GinEntryAccumulator *) a; - const GinEntryAccumulator *eb = (const GinEntryAccumulator *) b; - BuildAccumulator *accum = (BuildAccumulator *) arg; - - return ginCompareAttEntries(accum->ginstate, - ea->attnum, ea->key, ea->category, - eb->attnum, eb->key, eb->category); -} + BuildAccumulator *accum = (BuildAccumulator *) tb->private_data; + CompactAttribute *att; -/* Allocator function for rbtree.c */ -static RBTNode * -ginAllocEntryAccumulator(void *arg) -{ - BuildAccumulator *accum = (BuildAccumulator *) arg; - GinEntryAccumulator *ea; + if (a->attnum != b->attnum) + return false; + if (a->category != b->category) + return false; + if (a->category != GIN_CAT_NORM_KEY) + return true; /* - * Allocate memory by rather big chunks to decrease overhead. We have no - * need to reclaim RBTNodes individually, so this costs nothing. + * Compare the actual key values using image equality. + * This is correct because we don't want to deduplicate at this point. */ - if (accum->entryallocator == NULL || accum->eas_used >= DEF_NENTRY) - { - accum->entryallocator = palloc_array(GinEntryAccumulator, DEF_NENTRY); - accum->allocatedMemory += GetMemoryChunkSpace(accum->entryallocator); - accum->eas_used = 0; - } - - /* Allocate new RBTNode from current chunk */ - ea = accum->entryallocator + accum->eas_used; - accum->eas_used++; - - return (RBTNode *) ea; + att = TupleDescCompactAttr(accum->ginstate->origTupdesc, a->attnum - 1); + return datumIsEqual(a->key, b->key, att->attbyval, att->attlen); } +#define ST_SORT sort_itempointers +#define ST_ELEMENT_TYPE ItemPointerData +#define ST_COMPARE(a, b) ginCompareItemPointers(a, b) +#define ST_SCOPE static +#define ST_DEFINE +#include "lib/sort_template.h" + +#define ST_SORT sort_keys +#define ST_ELEMENT_TYPE GinSortEntry +#define ST_COMPARE_ARG_TYPE GinState +#define ST_COMPARE(a, b, state) ginCompareAttEntries(state, a->hashkey.attnum, a->hashkey.key, a->hashkey.category, b->hashkey.attnum, b->hashkey.key, b->hashkey.category) +#define ST_SCOPE static +#define ST_DEFINE +#include "lib/sort_template.h" + void ginInitBA(BuildAccumulator *accum) { /* accum->ginstate is intentionally not set here */ - accum->allocatedMemory = 0; - accum->entryallocator = NULL; - accum->eas_used = 0; - accum->tree = rbt_create(sizeof(GinEntryAccumulator), - cmpEntryAccumulator, - ginCombineData, - ginAllocEntryAccumulator, - NULL, /* no freefunc needed */ - accum); + accum->hash = ginbuild_create(CurrentMemoryContext, DEF_NENTRY, accum); + accum->allocatedMemory = accum->hash->size * sizeof(GinHashEntry); + accum->sorted_entries = NULL; + accum->num_entries = 0; + accum->current_pos = 0; } /* @@ -142,124 +153,109 @@ getDatumCopy(BuildAccumulator *accum, OffsetNumber attnum, Datum value) } /* - * Find/store one entry from indexed value. + * Insert one entry into the hash map. + * If the key already exists, append to its ItemPointer array. + * Otherwise, create a new hash entry with a new ItemPointer array. */ static void ginInsertBAEntry(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber attnum, Datum key, GinNullCategory category) { - GinEntryAccumulator eatmp; - GinEntryAccumulator *ea; - bool isNew; - - /* - * For the moment, fill only the fields of eatmp that will be looked at by - * cmpEntryAccumulator or ginCombineData. - */ - eatmp.attnum = attnum; - eatmp.key = key; - eatmp.category = category; - /* temporarily set up single-entry itempointer list */ - eatmp.list = heapptr; + GinHashKey hashkey; + GinHashEntry *entry; + bool found; + uint64 oldsize; + + hashkey.attnum = attnum; + hashkey.category = category; + if (category == GIN_CAT_NORM_KEY) + hashkey.key = getDatumCopy(accum, attnum, key); + else + hashkey.key = key; - ea = (GinEntryAccumulator *) rbt_insert(accum->tree, (RBTNode *) &eatmp, - &isNew); + oldsize = accum->hash->size; + entry = ginbuild_insert(accum->hash, hashkey, &found); - if (isNew) + if (!found) { - /* - * Finish initializing new tree entry, including making permanent - * copies of the datum (if it's not null) and itempointer. - */ - if (category == GIN_CAT_NORM_KEY) - ea->key = getDatumCopy(accum, attnum, key); - ea->maxcount = DEF_NPTR; - ea->count = 1; - ea->shouldSort = false; - ea->list = palloc_array(ItemPointerData, DEF_NPTR); - ea->list[0] = *heapptr; - accum->allocatedMemory += GetMemoryChunkSpace(ea->list); + entry->items = palloc_array(ItemPointerData, DEF_ITEMS_PER_KEY); + entry->numItems = 0; + entry->allocatedItems = DEF_ITEMS_PER_KEY; + accum->allocatedMemory += (accum->hash->size - oldsize) * sizeof(GinHashEntry); + accum->allocatedMemory += GetMemoryChunkSpace(entry->items); } - else + + if (entry->numItems >= entry->allocatedItems) { - /* - * ginCombineData did everything needed. - */ + uint32 new_allocated; + + if (entry->allocatedItems > UINT32_MAX / 2) + ereport(ERROR, + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), + errmsg("too many GIN item pointers for a single key"), + errhint("Reduce \"maintenance_work_mem\"."))); + + accum->allocatedMemory -= GetMemoryChunkSpace(entry->items); + new_allocated = entry->allocatedItems * 2; + entry->items = repalloc_huge(entry->items, sizeof(ItemPointerData) * new_allocated); + entry->allocatedItems = new_allocated; + accum->allocatedMemory += GetMemoryChunkSpace(entry->items); } + + entry->items[entry->numItems++] = *heapptr; } -/* - * Insert the entries for one heap pointer. - * - * Since the entries are being inserted into a balanced binary tree, you - * might think that the order of insertion wouldn't be critical, but it turns - * out that inserting the entries in sorted order results in a lot of - * rebalancing operations and is slow. To prevent this, we attempt to insert - * the nodes in an order that will produce a nearly-balanced tree if the input - * is in fact sorted. - * - * We do this as follows. First, we imagine that we have an array whose size - * is the smallest power of two greater than or equal to the actual array - * size. Second, we insert the middle entry of our virtual array into the - * tree; then, we insert the middles of each half of our virtual array, then - * middles of quarters, etc. - */ void ginInsertBAEntries(BuildAccumulator *accum, ItemPointer heapptr, OffsetNumber attnum, Datum *entries, GinNullCategory *categories, int32 nentries) { - uint32 step = nentries; - if (nentries <= 0) return; Assert(ItemPointerIsValid(heapptr) && attnum >= FirstOffsetNumber); - /* - * step will contain largest power of 2 and <= nentries - */ - step |= (step >> 1); - step |= (step >> 2); - step |= (step >> 4); - step |= (step >> 8); - step |= (step >> 16); - step >>= 1; - step++; - - while (step > 0) - { - int i; - - for (i = step - 1; i < nentries && i >= 0; i += step << 1 /* *2 */ ) - ginInsertBAEntry(accum, heapptr, attnum, - entries[i], categories[i]); - - step >>= 1; /* /2 */ - } + for (int i = 0; i < nentries; i++) + ginInsertBAEntry(accum, heapptr, attnum, entries[i], categories[i]); } -static int -qsortCompareItemPointers(const void *a, const void *b) -{ - int res = ginCompareItemPointers((const ItemPointerData *) a, (const ItemPointerData *) b); - - /* Assert that there are no equal item pointers being sorted */ - Assert(res != 0); - return res; -} - -/* Prepare to read out the rbtree contents using ginGetBAEntry */ +/* Prepare to read out the hash table contents using ginGetBAEntry */ void ginBeginBAScan(BuildAccumulator *accum) { - rbt_begin_iterate(accum->tree, LeftRightWalk, &accum->tree_walk); + ginbuild_iterator iter; + GinHashEntry *entry; + uint32 i = 0; + + accum->num_entries = accum->hash->members; + accum->current_pos = 0; + + if (accum->num_entries == 0) + return; + + accum->sorted_entries = palloc_array(GinSortEntry, accum->num_entries); + ginbuild_start_iterate(accum->hash, &iter); + + while ((entry = ginbuild_iterate(accum->hash, &iter)) != NULL) + { + GinSortEntry *se = &accum->sorted_entries[i]; + sort_itempointers(entry->items, entry->numItems); + + se->hashkey = entry->hashkey; + se->items = entry->items; + se->numItems = entry->numItems; + i++; + } + + Assert(i == accum->num_entries); + sort_keys(accum->sorted_entries, accum->num_entries, accum->ginstate); + accum->current_pos = 0; } /* - * Get the next entry in sequence from the BuildAccumulator's rbtree. + * Get the next entry in sequence from the BuildAccumulator's sorted hash entries. * This consists of a single key datum and a list (array) of one or more * heap TIDs in which that key is found. The list is guaranteed sorted. */ @@ -268,25 +264,18 @@ ginGetBAEntry(BuildAccumulator *accum, OffsetNumber *attnum, Datum *key, GinNullCategory *category, uint32 *n) { - GinEntryAccumulator *entry; - ItemPointerData *list; - - entry = (GinEntryAccumulator *) rbt_iterate(&accum->tree_walk); + GinSortEntry *entry; - if (entry == NULL) + if (accum->current_pos >= accum->num_entries) return NULL; /* no more entries */ - *attnum = entry->attnum; - *key = entry->key; - *category = entry->category; - list = entry->list; - *n = entry->count; - - Assert(list != NULL && entry->count > 0); + entry = &accum->sorted_entries[accum->current_pos]; + accum->current_pos++; - if (entry->shouldSort && entry->count > 1) - qsort(list, entry->count, sizeof(ItemPointerData), - qsortCompareItemPointers); + *attnum = entry->hashkey.attnum; + *key = entry->hashkey.key; + *category = entry->hashkey.category; + *n = entry->numItems; - return list; + return entry->items; } diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h index 6725ee2839f..39c015b1355 100644 --- a/src/include/access/gin_private.h +++ b/src/include/access/gin_private.h @@ -18,7 +18,6 @@ #include "common/int.h" #include "catalog/pg_am_d.h" #include "fmgr.h" -#include "lib/rbtree.h" #include "nodes/tidbitmap.h" #include "storage/bufmgr.h" @@ -420,26 +419,15 @@ extern void ginadjustmembers(Oid opfamilyoid, List *functions); /* ginbulk.c */ -typedef struct GinEntryAccumulator -{ - RBTNode rbtnode; - Datum key; - GinNullCategory category; - OffsetNumber attnum; - bool shouldSort; - ItemPointerData *list; - uint32 maxcount; /* allocated size of list[] */ - uint32 count; /* current number of list[] entries */ -} GinEntryAccumulator; typedef struct { - GinState *ginstate; - Size allocatedMemory; - GinEntryAccumulator *entryallocator; - uint32 eas_used; - RBTree *tree; - RBTreeIterator tree_walk; + GinState * ginstate; + Size allocatedMemory; + struct ginbuild_hash * hash; + struct GinSortEntry * sorted_entries; + uint32 num_entries; + uint32 current_pos; } BuildAccumulator; extern void ginInitBA(BuildAccumulator *accum); -- 2.51.0 [text/x-patch] v7-0002-Optimize-generate_trgm-with-radix-sort.patch (2.2K, 3-v7-0002-Optimize-generate_trgm-with-radix-sort.patch) download | inline diff: From eb8f6c90152ce976161b04c9d540f338a0640b2a Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Tue, 11 Nov 2025 13:18:59 +0100 Subject: [PATCH v7 2/3] Optimize generate_trgm() with radix sort --- contrib/pg_trgm/trgm_op.c | 59 +++++++++++++++++++++++++++++++++------ 1 file changed, 51 insertions(+), 8 deletions(-) diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c index 3fd087852e7..f5dc5242c7d 100644 --- a/contrib/pg_trgm/trgm_op.c +++ b/contrib/pg_trgm/trgm_op.c @@ -226,13 +226,56 @@ CMPTRGM_CHOOSE(const void *a, const void *b) return CMPTRGM(a, b); } -#define ST_SORT trigram_qsort_signed -#define ST_ELEMENT_TYPE_VOID -#define ST_COMPARE(a, b) CMPTRGM_SIGNED(a, b) -#define ST_SCOPE static -#define ST_DEFINE -#define ST_DECLARE -#include "lib/sort_template.h" +/* + * Needed to properly handle negative numbers in case char is signed. + */ +static inline unsigned char FlipSign(char x) +{ + return x^0x80; +} + +static void radix_sort_trigrams_signed(trgm *trg, int count) +{ + trgm *buffer = palloc_array(trgm, count); + trgm *starts[256]; + trgm *from = trg; + trgm *to = buffer; + int freqs[3][256]; + + /* + * Compute frequencies to partition the buffer. + */ + memset(freqs, 0, sizeof(freqs)); + + for (int i=0; i<count; i++) + for (int j=0; j<3; j++) + freqs[j][FlipSign(trg[i][j])]++; + + /* + * Do the sorting. Start with last character because that's the is "LSB" + * in a trigram. Avoid unnecessary copies by ping-ponging between the buffers. + */ + for (int i=2; i>=0; i--) + { + trgm *old_from = from; + trgm *next = to; + + for (int j=0; j<256; j++) + { + starts[j] = next; + next += freqs[i][j]; + } + + for (int j=0; j<count; j++) + memcpy(starts[FlipSign(from[j][i])]++, from[j], sizeof(trgm)); + + from = to; + to = old_from; + } + + memcpy(trg, buffer, sizeof(trgm) * count); + pfree(buffer); +} #define ST_SORT trigram_qsort_unsigned #define ST_ELEMENT_TYPE_VOID @@ -247,7 +290,7 @@ static void trigram_qsort(trgm *array, size_t n) { if (GetDefaultCharSignedness()) - trigram_qsort_signed(array, n, sizeof(trgm)); + radix_sort_trigrams_signed(array, n); else trigram_qsort_unsigned(array, n, sizeof(trgm)); } -- 2.51.0 [text/x-patch] v7-0001-Make-btint4cmp-branchless.patch (1.0K, 4-v7-0001-Make-btint4cmp-branchless.patch) download | inline diff: From abeea158b0f58344abb085f914d5cc5f804395c8 Mon Sep 17 00:00:00 2001 From: David Geier <[email protected]> Date: Mon, 10 Nov 2025 15:40:11 +0100 Subject: [PATCH v7 1/3] Make btint4cmp() branchless --- src/backend/access/nbtree/nbtcompare.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c index 1d343377e98..ac16e3d993d 100644 --- a/src/backend/access/nbtree/nbtcompare.c +++ b/src/backend/access/nbtree/nbtcompare.c @@ -61,6 +61,7 @@ #include "utils/fmgrprotos.h" #include "utils/skipsupport.h" #include "utils/sortsupport.h" +#include "common/int.h" #ifdef STRESS_SORT_INT_MIN #define A_LESS_THAN_B INT_MIN @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS) int32 a = PG_GETARG_INT32(0); int32 b = PG_GETARG_INT32(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); + PG_RETURN_INT32(pg_cmp_s32(a, b)); } Datum -- 2.51.0 ^ permalink raw reply [nested|flat] 31+ messages in thread
end of thread, other threads:[~2026-04-22 21:26 UTC | newest] Thread overview: 31+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-01-21 20:50 Re: Reduce build times of pg_trgm GIN indexes Matthias van de Meent <[email protected]> 2026-01-23 10:18 ` David Geier <[email protected]> 2026-03-02 12:17 ` David Geier <[email protected]> 2026-04-07 11:27 ` Heikki Linnakangas <[email protected]> 2026-04-08 02:15 ` John Naylor <[email protected]> 2026-04-13 15:05 ` David Geier <[email protected]> 2026-04-14 13:05 ` David Geier <[email protected]> 2026-04-09 11:28 ` Bertrand Drouvot <[email protected]> 2026-04-13 09:41 ` Peter Eisentraut <[email protected]> 2026-04-13 11:04 ` Bertrand Drouvot <[email protected]> 2026-04-13 15:03 ` David Geier <[email protected]> 2026-04-13 17:15 ` Bertrand Drouvot <[email protected]> 2026-04-14 09:22 ` Heikki Linnakangas <[email protected]> 2026-04-12 18:05 ` Tom Lane <[email protected]> 2026-04-13 15:22 ` David Geier <[email protected]> 2026-04-13 16:14 ` Heikki Linnakangas <[email protected]> 2026-04-14 07:02 ` David Geier <[email protected]> 2026-04-15 11:06 ` Heikki Linnakangas <[email protected]> 2026-04-15 19:12 ` Peter Eisentraut <[email protected]> 2026-04-15 21:25 ` Tom Lane <[email protected]> 2026-04-16 08:45 ` Peter Eisentraut <[email protected]> 2026-04-16 09:49 ` Heikki Linnakangas <[email protected]> 2026-04-16 14:37 ` Tom Lane <[email protected]> 2026-04-16 17:30 ` Heikki Linnakangas <[email protected]> 2026-04-16 17:47 ` Tom Lane <[email protected]> 2026-04-17 19:21 ` Heikki Linnakangas <[email protected]> 2026-04-22 21:11 ` Heikki Linnakangas <[email protected]> 2026-04-16 14:43 ` Andres Freund <[email protected]> 2026-04-13 15:06 ` David Geier <[email protected]> 2026-04-14 14:24 ` David Geier <[email protected]> 2026-04-22 21:26 ` David Geier <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox