Content-Type: multipart/mixed; boundary="------------aCgN3MnRBD7oS7jRvk4TkPw5"
Message-ID: <5650bf75-dcb8-446d-8cba-e626eb44594b@gmail.com>
Date: Tue, 14 Apr 2026 16:24:10 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Reduce build times of pg_trgm GIN indexes
From: David Geier
To: Heikki Linnakangas, Matthias van de Meent
Cc: pgsql-hackers
References:
 <5d366878-2007-4d31-861e-19294b7a583b@gmail.com> <9ac3931a-180e-4283-a7a8-05eb66099206@iki.fi> <2e11134f-02c3-43da-8c39-fb520a1a251d@iki.fi> <66620ec7-0f81-4813-9cf1-b901a56efcc3@gmail.com> <2a76b5ef-4b12-4023-93a1-eed6e64968f3@gmail.com> <6439c655-e281-409d-b884-6586750d5820@iki.fi> <5e74f77a-1dfc-45b5-9fcf-62afe8dbbaf2@gmail.com>
Content-Language: en-US
In-Reply-To: <5e74f77a-1dfc-45b5-9fcf-62afe8dbbaf2@gmail.com>
Precedence: bulk

This is a multi-part message in MIME format.
--------------aCgN3MnRBD7oS7jRvk4TkPw5
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 13.04.2026 17:06, David Geier wrote:
>> I squashed 0002 and 0004 into one commit, and did some more refactoring:
>> I created a trigram_qsort() helper function that calls the signed or
>> unsigned variant, so that that logic doesn't need to be duplicated in
>> the callers. For symmetry, I also added a trigram_qunique() helper
>> function which just calls qunique() with the new, faster CMPTRGM_EQ
>> comparator. Pushed these as commit 9f3755ea07.
>
> Thanks for committing these patches.

Attached are the remaining patches (previously 0003 and 0005), rebased
on the latest master.

Currently, there's no radix sort variant for the unsigned char case. Do
we care about this case, or is it fine if that case runs slower?

The following perf profiles show that trigram_qsort() goes from ~34%
down to ~7% with the radix sort optimization. The optimized run also
includes the btint4cmp() optimization. Without that, the result would be
even better.

With that change we could move on and tackle optimizing

1. 41.52% generate_trgm_only() by e.g. using an ASCII fast-path
2. 32.72% ginInsertBAEntries() by no longer using the RB-tree but e.g.
   also the radix sort

master

- heapam_index_build_range_scan
   - 99.40% ginBuildCallback
      - ginHeapTupleBulkInsert
         - 66.55% ginExtractEntries
            - 65.29% FunctionCall3Coll
               - gin_extract_value_trgm
                  - 62.80% generate_trgm
                     + 34.33% trigram_qsort (inlined)
                     + 26.20% generate_trgm_only
                     + 2.23% trigram_qunique (inlined)
                  + 1.74% detoast_attr
            + 1.19% qsort_arg_entries
         + 32.72% ginInsertBAEntries

patched

- heapam_index_build_range_scan
   - 99.42% ginBuildCallback
      - 95.95% ginHeapTupleBulkInsert
         - 59.11% ginExtractEntries
            - 56.93% FunctionCall3Coll
               - gin_extract_value_trgm
                  - 52.19% generate_trgm
                     + 41.52% generate_trgm_only
                     + 7.14% trigram_qsort (inlined)
                     + 3.53% trigram_qunique (inlined)
                  + 4.08% detoast_attr
            + 2.13% qsort_arg_entries
         + 36.78% ginInsertBAEntries

--
David Geier

--------------aCgN3MnRBD7oS7jRvk4TkPw5
Content-Type: text/x-patch; charset=UTF-8; name="v6-0002-Optimize-generate_trgm-with-radix-sort.patch"
Content-Disposition: attachment; filename="v6-0002-Optimize-generate_trgm-with-radix-sort.patch"
Content-Transfer-Encoding: base64

RnJvbSBiMjA5MTBhODE0ZThjOGI2NGMzZDc0ODQxZmUyN2E3YzEyZGQ5MjdhIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBEYXZpZCBHZWllciA8Z2VpZGF2LnBnQGdtYWlsLmNv bT4KRGF0ZTogVHVlLCAxMSBOb3YgMjAyNSAxMzoxODo1OSArMDEwMApTdWJqZWN0OiBbUEFU Q0ggdjYgMi8yXSBPcHRpbWl6ZSBnZW5lcmF0ZV90cmdtKCkgd2l0aCByYWRpeCBzb3J0Cgot LS0KIGNvbnRyaWIvcGdfdHJnbS90cmdtX29wLmMgfCA1OSArKysrKysrKysrKysrKysrKysr KysrKysrKysrKysrKystLS0tLS0KIDEgZmlsZSBjaGFuZ2VkLCA1MSBpbnNlcnRpb25zKCsp LCA4IGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL2NvbnRyaWIvcGdfdHJnbS90cmdtX29w LmMgYi9jb250cmliL3BnX3RyZ20vdHJnbV9vcC5jCmluZGV4IDBhY2E5YjU4MjZmLi5mMmRi ZTg4ZWNlMCAxMDA2NDQKLS0tIGEvY29udHJpYi9wZ190cmdtL3RyZ21fb3AuYworKysgYi9j b250cmliL3BnX3RyZ20vdHJnbV9vcC5jCkBAIC0yMjYsMTMgKzIyNiw1NiBAQCBDTVBUUkdN X0NIT09TRShjb25zdCB2b2lkICphLCBjb25zdCB2b2lkICpiKQogCXJldHVybiBDTVBUUkdN KGEsIGIpOwogfQogCi0jZGVmaW5lIFNUX1NPUlQgdHJpZ3JhbV9xc29ydF9zaWduZWQKLSNk ZWZpbmUgU1RfRUxFTUVOVF9UWVBFX1ZPSUQKLSNkZWZpbmUgU1RfQ09NUEFSRShhLCBiKSBD
TVBUUkdNX1NJR05FRChhLCBiKQotI2RlZmluZSBTVF9TQ09QRSBzdGF0aWMKLSNkZWZpbmUg U1RfREVGSU5FCi0jZGVmaW5lIFNUX0RFQ0xBUkUKLSNpbmNsdWRlICJsaWIvc29ydF90ZW1w bGF0ZS5oIgorLyoKKyAqIE5lZWRlZCB0byBwcm9wZXJseSBoYW5kbGUgbmVnYXRpdmUgbnVt YmVycyBpbiBjYXNlIGNoYXIgaXMgc2lnbmVkLgorICovCitzdGF0aWMgaW5saW5lIHVuc2ln bmVkIGNoYXIgRmxpcFNpZ24oY2hhciB4KQoreworCXJldHVybiB4XjB4ODA7Cit9CisKK3N0 YXRpYyB2b2lkIHJhZGl4X3NvcnRfdHJpZ3JhbXNfc2lnbmVkKHRyZ20gKnRyZywgaW50IGNv dW50KQoreworCXRyZ20gKmJ1ZmZlciA9IHBhbGxvY19hcnJheSh0cmdtLCBjb3VudCk7CisJ dHJnbSAqc3RhcnRzWzI1Nl07CisJdHJnbSAqZnJvbSA9IHRyZzsKKwl0cmdtICp0byA9IGJ1 ZmZlcjsKKwlpbnQgZnJlcXNbM11bMjU2XTsKKworCS8qCisJICogQ29tcHV0ZSBmcmVxdWVu Y2llcyB0byBwYXJ0aXRpb24gdGhlIGJ1ZmZlci4KKwkgKi8KKwltZW1zZXQoZnJlcXMsIDAs IHNpemVvZihmcmVxcykpOworCisJZm9yIChpbnQgaT0wOyBpPGNvdW50OyBpKyspCisJCWZv ciAoaW50IGo9MDsgajwzOyBqKyspCisJCQlmcmVxc1tqXVtGbGlwU2lnbih0cmdbaV1bal0p XSsrOworCisJLyoKKwkgKiBEbyB0aGUgc29ydGluZy4gU3RhcnQgd2l0aCBsYXN0IGNoYXJh Y3RlciBiZWNhdXNlIHRoYXQncyB0aGUgaXMgIkxTQiIKKwkgKiBpbiBhIHRyaWdyYW0uIEF2 b2lkIHVubmVjZXNzYXJ5IGNvcGllcyBieSBwaW5nLXBvbmdpbmcgYmV0d2VlbiB0aGUgYnVm ZmVycy4KKwkgKi8KKwlmb3IgKGludCBpPTI7IGk+PTA7IGktLSkKKwl7CisJCXRyZ20gKm9s ZF9mcm9tID0gZnJvbTsKKwkJdHJnbSAqbmV4dCA9IHRvOworCisJCWZvciAoaW50IGo9MDsg ajwyNTY7IGorKykKKwkJeworCQkJc3RhcnRzW2pdID0gbmV4dDsKKwkJCW5leHQgKz0gZnJl cXNbaV1bal07CisJCX0KKworCQlmb3IgKGludCBqPTA7IGo8Y291bnQ7IGorKykKKwkJCW1l bWNweShzdGFydHNbRmxpcFNpZ24oZnJvbVtqXVtpXSldKyssIGZyb21bal0sIHNpemVvZih0 cmdtKSk7CisKKwkJZnJvbSA9IHRvOworCQl0byA9IG9sZF9mcm9tOworCX0KKworCW1lbWNw eSh0cmcsIGJ1ZmZlciwgc2l6ZW9mKHRyZ20pICogY291bnQpOworCXBmcmVlKGJ1ZmZlcik7 Cit9CiAKICNkZWZpbmUgU1RfU09SVCB0cmlncmFtX3Fzb3J0X3Vuc2lnbmVkCiAjZGVmaW5l IFNUX0VMRU1FTlRfVFlQRV9WT0lECkBAIC0yNDcsNyArMjkwLDcgQEAgc3RhdGljIHZvaWQK IHRyaWdyYW1fcXNvcnQodHJnbSAqYXJyYXksIHNpemVfdCBuKQogewogCWlmIChHZXREZWZh dWx0Q2hhclNpZ25lZG5lc3MoKSkKLQkJdHJpZ3JhbV9xc29ydF9zaWduZWQoYXJyYXksIG4s IHNpemVvZih0cmdtKSk7CisJCXJhZGl4X3NvcnRfdHJpZ3JhbXNfc2lnbmVkKGFycmF5LCBu 
KTsKIAllbHNlCiAJCXRyaWdyYW1fcXNvcnRfdW5zaWduZWQoYXJyYXksIG4sIHNpemVvZih0 cmdtKSk7CiB9Ci0tIAoyLjUxLjAKCg== --------------aCgN3MnRBD7oS7jRvk4TkPw5 Content-Type: text/x-patch; charset=UTF-8; name="v6-0001-Make-btint4cmp-branchless.patch" Content-Disposition: attachment; filename="v6-0001-Make-btint4cmp-branchless.patch" Content-Transfer-Encoding: base64 RnJvbSAwZjM5Zjg2MThjNzg0Mjk5YjMwNTY2YTBiZjY4YmVlZjhiMmI4YWVmIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBEYXZpZCBHZWllciA8Z2VpZGF2LnBnQGdtYWlsLmNv bT4KRGF0ZTogTW9uLCAxMCBOb3YgMjAyNSAxNTo0MDoxMSArMDEwMApTdWJqZWN0OiBbUEFU Q0ggdjYgMS8yXSBNYWtlIGJ0aW50NGNtcCgpIGJyYW5jaGxlc3MKCi0tLQogc3JjL2JhY2tl bmQvYWNjZXNzL25idHJlZS9uYnRjb21wYXJlLmMgfCA4ICsrLS0tLS0tCiAxIGZpbGUgY2hh bmdlZCwgMiBpbnNlcnRpb25zKCspLCA2IGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3Ny Yy9iYWNrZW5kL2FjY2Vzcy9uYnRyZWUvbmJ0Y29tcGFyZS5jIGIvc3JjL2JhY2tlbmQvYWNj ZXNzL25idHJlZS9uYnRjb21wYXJlLmMKaW5kZXggMWQzNDMzNzdlOTguLmFjMTZlM2Q5OTNk IDEwMDY0NAotLS0gYS9zcmMvYmFja2VuZC9hY2Nlc3MvbmJ0cmVlL25idGNvbXBhcmUuYwor KysgYi9zcmMvYmFja2VuZC9hY2Nlc3MvbmJ0cmVlL25idGNvbXBhcmUuYwpAQCAtNjEsNiAr NjEsNyBAQAogI2luY2x1ZGUgInV0aWxzL2ZtZ3Jwcm90b3MuaCIKICNpbmNsdWRlICJ1dGls cy9za2lwc3VwcG9ydC5oIgogI2luY2x1ZGUgInV0aWxzL3NvcnRzdXBwb3J0LmgiCisjaW5j bHVkZSAiY29tbW9uL2ludC5oIgogCiAjaWZkZWYgU1RSRVNTX1NPUlRfSU5UX01JTgogI2Rl ZmluZSBBX0xFU1NfVEhBTl9CCQlJTlRfTUlOCkBAIC0yMDMsMTIgKzIwNCw3IEBAIGJ0aW50 NGNtcChQR19GVU5DVElPTl9BUkdTKQogCWludDMyCQlhID0gUEdfR0VUQVJHX0lOVDMyKDAp OwogCWludDMyCQliID0gUEdfR0VUQVJHX0lOVDMyKDEpOwogCi0JaWYgKGEgPiBiKQotCQlQ R19SRVRVUk5fSU5UMzIoQV9HUkVBVEVSX1RIQU5fQik7Ci0JZWxzZSBpZiAoYSA9PSBiKQot CQlQR19SRVRVUk5fSU5UMzIoMCk7Ci0JZWxzZQotCQlQR19SRVRVUk5fSU5UMzIoQV9MRVNT X1RIQU5fQik7CisJUEdfUkVUVVJOX0lOVDMyKHBnX2NtcF9zMzIoYSwgYikpOwogfQogCiBE YXR1bQotLSAKMi41MS4wCgo= --------------aCgN3MnRBD7oS7jRvk4TkPw5--
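
For readers who don't want to decode the attachment: the idea behind the radix sort mentioned in the message body can be sketched roughly as below. This is only a minimal standalone illustration, not the attached patch. The trgm typedef here is a simplified stand-in using explicit signed char, malloc()/free() replace PostgreSQL's memory-context allocation, and the names flip_sign() and radix_sort_trigrams_signed_sketch() are hypothetical. It does an LSD radix sort over the three bytes of each trigram and flips the sign bit so that bucket order matches signed-char comparison order.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for pg_trgm's trigram type (assumption). */
typedef signed char trgm[3];

/*
 * Map a signed byte to a bucket index 0..255 that preserves signed
 * ordering: -128 -> 0, -1 -> 127, 0 -> 128, 127 -> 255.
 */
static inline unsigned char
flip_sign(signed char x)
{
	return (unsigned char) ((unsigned char) x ^ 0x80);
}

/*
 * Hypothetical sketch: LSD radix sort of 'count' trigrams in signed-byte
 * order. Returns 0 on success, -1 if the scratch buffer can't be allocated.
 */
static int
radix_sort_trigrams_signed_sketch(trgm *trg, size_t count)
{
	size_t		freqs[3][256];
	trgm	   *buffer;
	trgm	   *from = trg;
	trgm	   *to;

	if (count < 2)
		return 0;

	buffer = malloc(count * sizeof(trgm));
	if (buffer == NULL)
		return -1;
	to = buffer;

	/* One counting pass gathers the histograms for all three positions. */
	memset(freqs, 0, sizeof(freqs));
	for (size_t i = 0; i < count; i++)
		for (int j = 0; j < 3; j++)
			freqs[j][flip_sign(trg[i][j])]++;

	/* Least significant byte first; ping-pong between the two arrays. */
	for (int pos = 2; pos >= 0; pos--)
	{
		trgm	   *starts[256];
		trgm	   *next = to;
		trgm	   *tmp;

		/* Prefix sums: where each bucket begins in the target array. */
		for (int b = 0; b < 256; b++)
		{
			starts[b] = next;
			next += freqs[pos][b];
		}

		/* Stable scatter into the buckets. */
		for (size_t i = 0; i < count; i++)
			memcpy(starts[flip_sign(from[i][pos])]++, from[i], sizeof(trgm));

		tmp = from;
		from = to;
		to = tmp;
	}

	/* After three passes the sorted data sits in 'buffer' (now 'from'). */
	memcpy(trg, from, count * sizeof(trgm));
	free(buffer);
	return 0;
}
```

An unsigned-char variant, which the message notes is currently missing, would presumably only need to drop the sign flip and index the buckets with the raw byte. That is a guess on my side and not something the patch set contains.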