Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vqiz6-00Ge5c-2f for pgsql-hackers@arkaria.postgresql.org; Fri, 13 Feb 2026 02:32:46 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vqiz6-00C9dy-08 for pgsql-hackers@arkaria.postgresql.org; Fri, 13 Feb 2026 02:32:44 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vqiz5-00C9dq-2K for pgsql-hackers@lists.postgresql.org; Fri, 13 Feb 2026 02:32:44 +0000 Received: from mail-qt1-x841.google.com ([2607:f8b0:4864:20::841]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vqiz4-00000000PqE-2UwJ for pgsql-hackers@lists.postgresql.org; Fri, 13 Feb 2026 02:32:44 +0000 Received: by mail-qt1-x841.google.com with SMTP id d75a77b69052e-503347e8715so5864871cf.2 for ; Thu, 12 Feb 2026 18:32:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1770949960; cv=none; d=google.com; s=arc-20240605; b=bYhVCZe5Tovp4KWAsHx6SGxxJJ5TpKyIDRDQslPDNoRrWEcW3McGNHGFd4p6hurzHG Qh1oQn0u47QDROMCMTS8Tbo1e/6rwh+4jQwZgFEJxlMnkN7h0Q0Fk6T593TCH3dOoqxm x9S6stcPR4Htm1/KQARnhL3mheMQHL+Qu1wcCi1JPvTDRb2MrX69XoNIfeFqdR9dHng2 MJ7MLBAbEqQ7+RkiNbsFCUNH3+DkWNJbaBSi9f/6d5CAH+Xxz63pRh1O+Vwm2+9kV9mS lHy0XfNkVyqbHPhvsTu4VxQquCGFpRMv3pH9w4ZjRTYcR7b75G5d4zzlwV1LvKMR2/Ed By1Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=7cW+L2x/KxYM7Jj8X8xNaNbrTHZLkRMyloxr8/Yf4z0=; fh=6LDy0HRDUukMtJ0z00ycHm0ANVxm8x1xJMoAixAQWy4=; b=aqqKy3jBc0IqJIqHG6KLWGrfqLI2L+pOFVB91a1tb8sKkWB6B3V4uogpKLFnrRkZSg EhORXppmkDP7yvhG3YdkoaEDdeYop+SnHByz4y4S7V8XkTsm+skjv4rUcQL2R6lE/nZA L/k1RLRT6ADXF53QwcMY41VuYzjElLz0bJ8KH/xpwcB8+li/e5NIsCkw6Ut4KcL9dwHB 1lJg+sNMZx4kH9SrtYSA9iPWqOU95k9fQ0xMy6P9PrvdjUBduoGilKzHIz/6lyciAK6l 26Xl+fqRPHFjTOb9SQjJMu7bS5GxoZSbm1AE/iOwIm/l0Ol9MWzszIVvrXzopvaitiF2 scmQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1770949960; x=1771554760; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=7cW+L2x/KxYM7Jj8X8xNaNbrTHZLkRMyloxr8/Yf4z0=; b=XThdVndPPmHZny9RDlRNHOxJpNkxPV2YtIxl04VjcbAncHH7/MBpY7rC+wD0EkrKEw 5Y5ci8Q75MTdOaFO+hvhykVhMgZwz1kP3LeX3v2THWwHxCt7JSP3YTVrPZC8z7IcREUz FrA2k7Q/MihhJZMItu+8EwJPIeRJrWW7jCIq9afFDj8eTfoVo60cCFuvIQ9xgsiR/eEm z3UzvlTmlS+4RbmbWJh4TtqWOzYxWWUTrEM6xnT1lgeMudyfhQQL+HZZVUR5CxbIlSnp Rqcdfi3o1YMPBVu8CCvmTD5rWFxrJcH8+YpQr6Q+HM3k1FaDCAUsZLbleupA6d15TxnF 7f6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1770949960; x=1771554760; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=7cW+L2x/KxYM7Jj8X8xNaNbrTHZLkRMyloxr8/Yf4z0=; b=mdqeyUzx2kRIr9PtUa237Gh0R1JW9TfjClQ1V/5jvtOfyw3CkavVJOdcdcvAgLjgE5 E8Hc3Z0bYhGb+lR2nADrlPfRG7BC6ZciB5QpLc+GNypKdYk0KERQuhANZUhwH94x0/iQ Vcpq4SDwxnUZTPnlphQNlrktD0B8JT0drrI8a5fj7Xce2aQMc0U6SSG9pFeEN2YTY1Ku XU5VD7LNdvb16yhI2+s45LoEw3etZ6Y6v4RuKBwuZPWN2PLEYnCCl1UK1RhjKZ+v2OUT 9pqp+i3z6wpfV3o3QcVHiBERya2oWyAJYWZJpJSa0M7rcOd32dmuIPHIa1A56+ll7b5J WhsQ== X-Forwarded-Encrypted: i=1; AJvYcCVnvFof4tIQxXynShCohEFLYGeUaPNpDS5GTs9QLDSYHSziKn+v7TCwfy7lUaEQMzQBXOtGes85kTR7FuYm@lists.postgresql.org X-Gm-Message-State: AOJu0YzPGW+O6lUEftexoqlyjCoZFsXP9voIZUp6ApbABmHyPq0aqSKg E+2Kctl9LITzbOiuj2JvKycUPzWn7uk44l5qHHVzxjGZ9qVO1W0SoGHT81dGfpGr92nhiwPmsiY g5yg3Coygma8vpcD+M2K3TMui1rfQU2DV2uvTPN8= X-Gm-Gg: AZuq6aJn005ywdwOVEIuEZJDZInf3S8aA6SAxS9LJfChu4gHHBN3VpZWqHH//ljbcSc MAkJ7TStmWZx8NfTuD8bpt0weV8Ir1uaER5XnflJGrh/fy9bj2N2JOf4c1zmJiFupbqLFDYtmEX 7hNkaXQ+9WkU+kHrdDgZUmDkwl4aLP96UrjfbgytvioGxAKD4gHCcXDIe+VGp+aOUMjsPZobIeP Jl7to8APZkHdn9wMoW8czfpdzktkk1h0m0HVe42yeFseSx53OGe0qb8gZHuwOF7Di9wIb1Vv3wB RhPqH2vqR2KnkgStz2MuZt36fGBhKSf12pfC2RYzZ+va19xvHwhQsjJQ3A4YTnR6aWNwBTnQQE+ EgyBxYnNX6g/TXWXfrkECkvVaQUM= X-Received: by 2002:a05:622a:1189:b0:4ed:1af6:230e with SMTP id d75a77b69052e-506a832400bmr6303971cf.56.1770949960470; Thu, 12 Feb 2026 18:32:40 -0800 (PST) MIME-Version: 1.0 References: <269A8FB9-6D43-43CF-A6FE-52D28CBDB8A9@Outlook.com> <606C775A-4C1A-482B-BE7D-2E7A46AE14B9@gmail.com> <9D3D4647-868B-4562-B382-D201478AD67B@gmail.com> <54659731-2232-4E74-9533-D136D01B153F@gmail.com> In-Reply-To: From: John Naylor Date: Fri, 13 Feb 2026 09:32:29 +0700 X-Gm-Features: AZwV_QiPBrTweFekTsAAIJBMYXBVpAhFNOQVqDIzvc7bRz2BJsy-iqoE-J1B_5E Message-ID: Subject: Re: tuple radix sort To: cca5507 Cc: zengman , pgsql-hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Feb 11, 2026 at 5:25=E2=80=AFPM zengman wr= ote: > I'm wondering if we should replace `state->memtuples` with `data` in the = `sort_byvalue_datum()` function here. > ``` > if (nulls_first) > { > null_start =3D state->memtuples; Sounds good. On Wed, Feb 11, 2026 at 6:47=E2=80=AFPM cca5507 wrote: > +1. Now data =3D=3D state->memtuples, how about remove the parameter "dat= a" and "n" and just > get them from "Tuplesortstate"? The parameter list currently looks like a sort function. I'm not sure why you think that's objectionable? > Some comments for v7: > > 1) > > ``` > /* > * Retrieve byte from datum, indexed by 'level': 0 for LSB, 7 for MSB > */ > static inline uint8 > current_byte(Datum key, int level) > { > int shift =3D (SIZEOF_DATUM - 1 - level) * BI= TS_PER_BYTE; > > return (key >> shift) & 0xFF; > } > ``` > > Maybe "0 for MSB, 7 for LSB"? If level =3D=3D 0, this function will retur= n the Most Significant Byte. Will fix, thanks. > 2) radix_sort_tuple() > > ``` > size_t end_offset =3D partitions[*rp].next_offse= t; > SortTuple *partition_end =3D begin + end_offset; > ptrdiff_t num_elements =3D end_offset - start_offse= t; > ``` > > Why the type of "num_elements" is "ptrdiff_t"? Maybe just "size_t"? That's from upstream. It's pretty rare in our codebase for in-core code, so I'll go ahead and change it. > 3) tuplesort_sort_memtuples() > > ``` > /* > * Do we have the leading column's value or abbreviation = in datum1? > */ > if (state->base.haveDatum1 && state->base.sortKeys) > { > SortSupportData ssup =3D state->base.sortKeys[0]; > ``` > > I think we should avoid the copy of SortSupportData. We can just use a po= inter. Will do, for consistency. > 4) > > Many places just consider "Datum" as integer, do we need to add a DatumGe= tUInt**() for them? It already does that for the case where it specifically needs uint32. For the general case, I don't think so. We can treat a Datum as an unsigned integer and don't care about the actual size or what it contains. There is one exception that I can see: 0002 has a call to pg_leftmost_one_pos64(), so a macro for that would be more appropriate than an implicit cast. Now I'm wondering if 0002 should normalize the datum as it determines the common prefix. That should only matter for int32 where the input has a mix of positive and negative, but it could be worth the additional calculation. I'll hold off on committing 0002 for a bit longer, but 0001 (plus the changes that I've agreed to) is still on schedule. Also, looking at commit 6aebedc38, it's preferred to use "sizeof(Datum)" rather than the vestigial SIZEOF_DATUM, so I'll change that in 0001/2. -- John Naylor Amazon Web Services