Message-ID: <6439c655-e281-409d-b884-6586750d5820@iki.fi>
Date: Tue, 7 Apr 2026 14:27:40 +0300
Subject: Re: Reduce build times of pg_trgm GIN indexes
From: Heikki Linnakangas
To: David Geier, Matthias van de Meent
Cc: pgsql-hackers
In-Reply-To: <2a76b5ef-4b12-4023-93a1-eed6e64968f3@gmail.com>
References: <5d366878-2007-4d31-861e-19294b7a583b@gmail.com>
 <9ac3931a-180e-4283-a7a8-05eb66099206@iki.fi>
 <2e11134f-02c3-43da-8c39-fb520a1a251d@iki.fi>
 <66620ec7-0f81-4813-9cf1-b901a56efcc3@gmail.com>
 <2a76b5ef-4b12-4023-93a1-eed6e64968f3@gmail.com>

On 03/03/2026 19:31, David Geier wrote:
>> Attached are the patches rebased on latest master.
>>
>> I've removed the ASCII fast-path patch 0006 as it turned out to be more
>> complicated to make work than expected.
>>
>> I kept the radix sort patch because it gives a decent speedup but I
>> would like to focus for now on getting patches 0001 - 0004 merged.
>> They're all simple and, the way I see it, uncontroversial.
>>
>> I remeasured the savings of 0001 - 0004, which comes on top of the
>> already committed patch that inlined the comparison function, which gave
>> another ~5%:
>>
>> Data set            | Patched (ms) | Master (ms)  | Speedup
>> --------------------|--------------|--------------|----------
>> movies(plot)        |        8,058 |       10,311 | 1.27x
>> lineitem(l_comment) |      223,233 |      256,986 | 1.19x
>>
>> I've also registered the change at the commit fest, see
>> https://commitfest.postgresql.org/patch/6418/.
>
> Attached is v5 that removes an incorrect assertion from the radix sort code.
>
> v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch
> v5-0002-Optimize-generate_trgm-with-sort_template.h.patch
> v5-0003-Make-btint4cmp-branchless.patch
> v5-0004-Faster-qunique-comparator-in-generate_trgm.patch
> v5-0005-Optimize-generate_trgm-with-radix-sort.patch

Pushed 0001 as commit 6f5ad00ab7.

I squashed 0002 and 0004 into one commit, and did some more refactoring:
I created a trigram_qsort() helper function that calls the signed or
unsigned variant, so that that logic doesn't need to be duplicated in
the callers. For symmetry, I also added a trigram_qunique() helper
function which just calls qunique() with the new, faster CMPTRGM_EQ
comparator. Pushed these as commit 9f3755ea07.

Patch 0003 gives me pause.
It's a tiny patch:

> @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS)
>      int32 a = PG_GETARG_INT32(0);
>      int32 b = PG_GETARG_INT32(1);
>
> -    if (a > b)
> -        PG_RETURN_INT32(A_GREATER_THAN_B);
> -    else if (a == b)
> -        PG_RETURN_INT32(0);
> -    else
> -        PG_RETURN_INT32(A_LESS_THAN_B);
> +    PG_RETURN_INT32(pg_cmp_s32(a, b));
>  }

But the comments on the pg_cmp functions say:

>  * NB: If the comparator function is inlined, some compilers may produce
>  * worse code with these helper functions than with code with the
>  * following form:
>  *
>  *     if (a < b)
>  *         return -1;
>  *     if (a > b)
>  *         return 1;
>  *     return 0;
>  *

So, uh, is that really a universal improvement? Is that comment about
producing worse code outdated?

- Heikki
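
PS. For anyone who wants to check the generated code on their own
compiler, here is a minimal standalone sketch of the two comparator
shapes in question. It is not the PostgreSQL source: the function names
are made up for illustration, the branching version assumes
A_GREATER_THAN_B and A_LESS_THAN_B are 1 and -1, and the branchless
version assumes pg_cmp_s32() boils down to (a > b) - (a < b).

```c
#include <stdint.h>

/*
 * Branching form, mirroring the body that patch 0003 removes.
 * Assumption: A_GREATER_THAN_B == 1 and A_LESS_THAN_B == -1.
 */
static int
cmp_branching(int32_t a, int32_t b)
{
	if (a > b)
		return 1;
	else if (a == b)
		return 0;
	else
		return -1;
}

/*
 * Branchless form. Assumption: this is what pg_cmp_s32() amounts to.
 * Each comparison yields 0 or 1, so the difference is -1, 0 or 1, and
 * unlike "a - b" it cannot overflow.
 */
static int
cmp_branchless(int32_t a, int32_t b)
{
	return (a > b) - (a < b);
}
```

Both functions return the same values for all inputs, so the only
question is what the compiler emits for each. Building the file with
something like "cc -O2 -S" and comparing the assembly, both standalone
and when inlined into a sort loop, should show whether the caveat in
the comment still applies on a given compiler.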