Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQi8Z-00FROw-1w for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Dec 2025 08:23:00 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vQi8Y-00C9to-1r for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Dec 2025 08:22:58 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQi8Y-00C9tf-0V for pgsql-hackers@lists.postgresql.org; Wed, 03 Dec 2025 08:22:58 +0000 Received: from mail-pl1-x633.google.com ([2607:f8b0:4864:20::633]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vQi8V-002tRH-1p for pgsql-hackers@lists.postgresql.org; Wed, 03 Dec 2025 08:22:57 +0000 Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-29555b384acso71918825ad.1 for ; Wed, 03 Dec 2025 00:22:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1764750175; x=1765354975; darn=lists.postgresql.org; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=AEIugNFV3BmeZXRXrlmf8YIXXn65R1jpUTHjwVA0bz8=; b=jG0BbJCpEqPc0Q12ptcswArLHn3dkDWsJShqJRycwilPgc6VVVJkwBh7N4YXfAwFqz HgYtYMd9jDlpdMMOWP7v56nxj8maKiPlkKggB+ytl2qa2j0r3lZZR57RJwllVuQmAkZ5 9JMu3LdV8wRNZZDdIbmwCKMr+CbpKf9u9CH775oaTIv+uyGEw5w/hj+d/tJmCdnwNUJm j4IWJbpBv5G4YW3cHg6kTOvQAFSxcmmWBkkfd2iwL1N+Cm/lGkNJ8AzptJc7Iei44raT pqzI8gACXmiTihXQULuL5JyB52vRfSf+pYnfJEwRmO6Lg947CCmSZXhqQD+Me3HzmGEH De3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764750175; x=1765354975; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-gg:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=AEIugNFV3BmeZXRXrlmf8YIXXn65R1jpUTHjwVA0bz8=; b=MoI9L193XgLvX4uDzkpM4yjn2TKxPpj90gpnFbNuWbDKWgRDaobi8e0Kweu/qPinGd xrMT7ZIuwIpnygXJuti3prtGuwFDeFTrwJPSJ5iIkP28F1FL8euoClB0kKa1s5MwRq2v hqJ2KZi3zc4tTmCg8Z900ahLvNW8qJAExZEp4P0Uy5s4oJFITlb0BpyNpTdD0VLAV6pK QjAVwnigwfirC0bbGR1lc2JFAvj75ydAjht5KF6RfcyIqetaLOBpbPcgOfjQrigjHZlV pS/RzWsijV/jpgzbyg3itJNRTUGE5SHhu7GQjQGdDxMwkkPPm1LSUJkG91bXcCQhWmMn ySHw== X-Forwarded-Encrypted: i=1; AJvYcCXp76UL0Nio900HonwcNjVpAlZpnBih/sONI2DrMEvlW4JL2vqM1uYSvS+Uza49KnjL0GSyFD09xLkQZdSv@lists.postgresql.org X-Gm-Message-State: AOJu0YzfL9t8+fiiLSzhnwk8K1YF58vvsh01s/Hx/BJMiJr8Mo5rJdZA QnyHhV06dpNjrsbpqKJeSSzT3/2e5CZ/iMi1GMVBaU9RB0dinzd5Uiot X-Gm-Gg: ASbGncuijk3dsh/lr5ZVZQwtoZj+F4xs4YotUARV9BLnWnKpZDbLDxvnEltxEz72UZU AWbNBq16B4EGGPcIrtO/mr+zs+mRWHRRxMK8X3yWJFWLg6yBDnQz85+yUE26gaJAm8uxrdCWSp3 vHs0rhHXDwV90LSkq7DJQgD+zgnxfYCxyi/UJQG/ARU1gw2v3LGGrznpK6StFOJ0VXH+3+t5TYC 3yPEl69MCUtyzB+7x9E8Ayehx4FxjwLXHza/LxpIGaTf2s++5XLInVU/T1jxr2StWoNvLXXYeXw I6iVPfD60orffBINKHGDqgMtGPYNBTrq3chba/eRfUnGkMku43zjcpzd4YuSExMeZi1FxaZMzeF yXGCNwz1YcPNe85Zwe4gN7u6kO6j9XgbcEr+fbf6copKjeWg1g6LbRwJ6rEQPYpBp7hs9+HgQkJ YWt7b+QexnQ4ILM/d/XeA= X-Google-Smtp-Source: AGHT+IGevkxquWm4rqVt/++yS651JlVX5M08nD1oOr+SBQ8FuyzUqpwNtod5gFLXke/1GNdEdQeh5Q== X-Received: by 2002:a17:902:d552:b0:297:c4b0:8d58 with SMTP id d9443c01a7336-29d683b2b6bmr19409375ad.46.1764750175254; Wed, 03 Dec 2025 00:22:55 -0800 (PST) Received: from smtpclient.apple ([45.32.121.103]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-29bce41785bsm175278655ad.20.2025.12.03.00.22.51 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Dec 2025 00:22:54 -0800 (PST) From: Chao Li Message-Id: <606C775A-4C1A-482B-BE7D-2E7A46AE14B9@gmail.com> Content-Type: multipart/mixed; boundary="Apple-Mail=_A7A6D6D3-1EAE-40E7-AD01-5BBEC88DC6FC" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81\)) Subject: Re: tuple radix sort Date: Wed, 3 Dec 2025 16:22:13 +0800 In-Reply-To: Cc: Chengpeng Yan , "pgsql-hackers@lists.postgresql.org" , Peter Geoghegan To: John Naylor References: <269A8FB9-6D43-43CF-A6FE-52D28CBDB8A9@Outlook.com> X-Mailer: Apple Mail (2.3826.700.81) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Mail=_A7A6D6D3-1EAE-40E7-AD01-5BBEC88DC6FC Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Hi John, I played with this again today and found an optimization that seems to = dramatically improve the performance: ``` +static void +radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, = Tuplesortstate *state) +{ + RadixPartitionInfo partitions[256] =3D {0}; + uint8_t remaining_partitions[256] =3D {0}; ``` Here partitions and remaining_partitions are just temporary buffers, = allocating memory from stack and initialize them seems slow. By passing = them as function parameters are much faster. See attached diff for my = change. V5 patch: by the way, v5 is very faster than v1. ``` evantest=3D# select * from test_multi order by category, name; Time: 299.912 ms evantest=3D# select * from test_multi order by category, name; Time: 298.793 ms evantest=3D# select * from test_multi order by category, name; Time: 300.306 ms evantest=3D# select * from test_multi order by category, name; Time: 302.140 ms ``` v5 + my change: ``` evantest=3D# select * from test_multi order by category, name; Time: 152.572 ms evantest=3D# select * from test_multi order by category, name; Time: 143.296 ms evantest=3D# select * from test_multi order by category, name; Time: 151.606 ms ``` The test I did today is just the high cardinality first column test I = had done before: ``` drop table if exists test_multi; create unlogged table test_multi (category int, name text); insert into test_multi select (random() * 1000000)::int as category, = md5(random()::text) || md5(random()::text) as name from = generate_series(1, 1000000); vacuum freeze test_multi; \timing on \o /dev/null set wip_radix_sort =3D =E2=80=98on; set work_mem =3D =E2=80=982GB=E2=80=99; select * from test_multi order by category, name; ``` Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ --Apple-Mail=_A7A6D6D3-1EAE-40E7-AD01-5BBEC88DC6FC Content-Disposition: attachment; filename=tuplesort_chaoli.diff Content-Type: application/octet-stream; x-unix-mode=0644; name="tuplesort_chaoli.diff" Content-Transfer-Encoding: 7bit diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index 028c5b71c27..2bf2785f5ec 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -2760,10 +2760,11 @@ normalize_datum(Datum orig, SortSupport ssup) * DEALINGS IN THE SOFTWARE. */ static void -radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, Tuplesortstate *state) +radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, Tuplesortstate *state, + RadixPartitionInfo * partitions, uint8_t *remaining_partitions) { - RadixPartitionInfo partitions[256] = {0}; - uint8_t remaining_partitions[256] = {0}; + /* RadixPartitionInfo partitions[256] = {0}; */ + /* uint8_t remaining_partitions[256] = {0}; */ size_t total = 0; int num_partitions = 0; int num_remaining; @@ -2771,6 +2772,9 @@ radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, Tuplesortstate *st size_t start_offset = 0; SortTuple *partition_begin = begin; + memset(partitions, 0, sizeof(RadixPartitionInfo) * 256); + memset(remaining_partitions, 0, sizeof(uint8_t) * 256); + /* count number of occurrences of each byte */ for (SortTuple *tup = begin; tup < begin + n_elems; tup++) { @@ -2881,7 +2885,9 @@ radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, Tuplesortstate *st radix_sort_tuple(partition_begin, num_elements, level + 1, - state); + state, + partitions, + remaining_partitions); } } else if (state->base.onlyKey == NULL) @@ -3019,10 +3025,15 @@ sort_byvalue_datum(Tuplesortstate *state) } else { + RadixPartitionInfo partitions[256]; + uint8_t remaining_partitions[256]; + radix_sort_tuple(not_null_start, not_null_count, 0, - state); + state, + partitions, + remaining_partitions); } } } --Apple-Mail=_A7A6D6D3-1EAE-40E7-AD01-5BBEC88DC6FC Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii --Apple-Mail=_A7A6D6D3-1EAE-40E7-AD01-5BBEC88DC6FC--