Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vOWl5-00BRQL-1K for pgsql-hackers@arkaria.postgresql.org; Thu, 27 Nov 2025 07:49:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vOWl3-004TGK-20 for pgsql-hackers@arkaria.postgresql.org; Thu, 27 Nov 2025 07:49:41 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vOWl3-004TGB-0m for pgsql-hackers@lists.postgresql.org; Thu, 27 Nov 2025 07:49:41 +0000 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vOWl0-001i07-2h for pgsql-hackers@lists.postgresql.org; Thu, 27 Nov 2025 07:49:40 +0000 Received: by mail-pf1-x42f.google.com with SMTP id d2e1a72fcca58-7bab7c997eeso673698b3a.0 for ; Wed, 26 Nov 2025 23:49:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1764229778; x=1764834578; darn=lists.postgresql.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Ya/fJm+kHnDcm4XjojcmkjOdNR8BQoEA6SWhVF8fGR0=; b=iQ97d+zmG2FMr7OhJrQxNvihDJg4ACKkgXKPhoOOOJdXeMXVBOvICbvNp4B1tyG976 R1bY66qDkStTRX4yhfDUT5UGFDgz/N5DkHZRrYTLqmCdGky6/l4hYGjEVB/CNao9CPza RX0Aw3vn1wyN07mE+nsFSkbE4GwrBV3e38UL50reQuVisIAzcAU6woLGWcgxkDEbbnLv 7GlqDiOqnkLUTbNDdzgDkKAhdcP5ijNy/6AMbgqyPXLS9CoKLZcbzWtwnKwXGOKBZd2j XigZ0b+48s1sSxK52R8cIsr5FbP8YGGZ0QSBObQ8JtJ/wOhmiy+eWQ8SCtOYSWiGFVSB vuyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764229778; x=1764834578; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=Ya/fJm+kHnDcm4XjojcmkjOdNR8BQoEA6SWhVF8fGR0=; b=JtBSaou1UNgnfLLTFYBQ5mKRf/k8qe7sQ8MXhAFJbNE3aSLfW37VyrY6VSp17bmS1Y smexcKZfSgRhXoCtMecZ+atsrrEhC/KJpabefziDiple6pALcBuSYkUxZva6sofvYqYQ ZpqmFptOMD+hN9W8X2TeHbfzYeMn15TC3NmJ6WcGIg3pMjGYLga5qXQg6EK2bq9iw8oY J6RcITunN71Jv6KHijoL+7Plnh5st4zi/QSc4tQTu/OHDITMoDEhMowbtFoqPf229SWH 3rpFmOa0y2dO8HChAR5Z0G06fZL6KKlcNessAQqWzmjxQ6XquF1swU81zdN308HX/NNY kocw== X-Forwarded-Encrypted: i=1; AJvYcCWAUaGvCZYEpo7a1LbpG2QaLsV2zAd0FwLubL3o9uKYuAJDCHlzg7V+6IXGZ9Ba9fIJbCk+KW9RSPfE9sif@lists.postgresql.org X-Gm-Message-State: AOJu0YwbeF5GibDkJ5zt4rVVfEOVR+elm0c/EVMpslsTbIanCI3bZxFw wgZwtKhxC2N2T8XlRXWTP80In/gY0FP2KnhxvxSKM6u6exh0hWHNPit6 X-Gm-Gg: ASbGncsQL+QwBrbhlPr5p07ZRcWga6AYYgrZaH6iU3U/1HTjmyxSk3LW4lSBJIr2LjS +wIOr1/zSxAdPM/NRdIPzXUipENF0nwv8SDEyFi4LY5qKSRvGtDwTC62e9WM5MQMm/s9dmCZZmp XsBPRgjaDIbdcUO0tIB/gaWDHBZL2HAB7KIcmkR9XUn7LBoUN4wH7vT0jouW9oYIWfvzCHLyOe4 0LXwZmVAIy2ZJyppDLemCDhQRyBLpslOWNgCidr4iJrfjAoDTNUORNASKUNcgQVu4dYrvmZAiUN DsFK10XUSsz5DaJrKk9o4jnNRiiE1DHwbB0MFpX+6Crr9Kj3eZnjzdXJhYLWH/hM2tT9W5G7y1V cjrcZQnY1z3qB/qVTjqM9EtWJvXXuHO+m6Uhf7Ky6gr8u7ktHwoYO9d9O5em5z4Rq6FlNDQo+Ux 8xtY0lQHnwwMa1p3Wfk7M= X-Google-Smtp-Source: AGHT+IEfEFzEtVqHID3+t9/+CopSwQOrj/8ffIFZaAE2uCbFhZNuTA6Z9qh8tf3wuWnqtaV9lCB1Tg== X-Received: by 2002:a05:6a20:e293:b0:33e:6885:2bd4 with SMTP id adf61e73a8af0-36150f058ffmr25726693637.29.1764229778565; Wed, 26 Nov 2025 23:49:38 -0800 (PST) Received: from smtpclient.apple ([45.32.121.103]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-be509d5723dsm999218a12.30.2025.11.26.23.49.36 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 26 Nov 2025 23:49:37 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81\)) Subject: Re: tuple radix sort From: Chao Li In-Reply-To: Date: Thu, 27 Nov 2025 15:48:58 +0800 Cc: Chengpeng Yan , "pgsql-hackers@lists.postgresql.org" , Peter Geoghegan Content-Transfer-Encoding: quoted-printable Message-Id: References: <269A8FB9-6D43-43CF-A6FE-52D28CBDB8A9@Outlook.com> To: John Naylor X-Mailer: Apple Mail (2.3826.700.81) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi John, I did an initial test before, but I didn=E2=80=99t read the code at the = time. Today, I spent time reviewing 0001. Overall, I believe the = implementation is solid. I just got a few comments/suggestions. > On Nov 26, 2025, at 21:11, John Naylor = wrote: >=20 >=20 > For v5 I've also added CHECK_FOR_INTERRUPTS and rewrote some comments. >=20 >=20 > -- > John Naylor > Amazon Web Services > = 1 - 0001 ``` + /* extract the byte for this level from the normalized = datum */ + current_byte =3D = extract_byte(normalize_datum(tup->datum1, ssup), + = level); + + /* save it for the permutation step */ + tup->current_byte =3D current_byte; ``` We recompute normalize_datum(tup->datum1, ssup) for every tuple in every = level, why don=E2=80=99t cache the result in SortTuple. As we have = cached current_byte in SortTuple, it shouldn=E2=80=99t be a big deal to = add one more field to it. 2 - 0001 ``` + while (num_remaining > 1) + { + /* start the count over */ + num_remaining =3D num_partitions; ``` The swap loop always start the count over, so that sorted partitions are = re-scanned as well. I think we can do an optimization like: ``` num_active =3D num_partitions; while (num_active > 1) { for (int i =3D 0; i < num_active; ) { uint8 idx =3D remaining_partitions[i]; // do the swaps for the partition =E2=80=A6 if (partitions[idx].offset =3D=3D partitions[idx].next_offset) { remaining_partitions[i] =3D remaining_partitions[num_active = - 1]; num_active--; } else i++; } } ``` This way we move out sorted partitions, so they will not be re-scanned. 3 - 0001 In sort_byvalue_datum, we can add a fast-path for all NULL and all NOT = NULL cases, so that they won=E2=80=99t need to run the branchless cyclic = permutation and the two =E2=80=9Cfor=E2=80=9D loops of assertions. = Something like: ``` while (d1 < state->memtupcount && data[d1].isnull1 =3D=3D nulls_first) d1++; null_count =3D d1; not_null_count =3D state->memtupcount - d1; /* fast paths: all on one side */ if (null_count =3D=3D 0 || not_null_count =3D=3D 0) { if (nulls_first) { null_start =3D data; not_null_start =3D data + null_count; } else { not_null_start =3D data; null_start =3D data + not_null_count; } /* only one partition is non-empty; sort it and return */ if (not_null_count > 1) { if (not_null_count < QSORT_THRESHOLD) qsort_tuple(not_null_start, not_null_count, = state->base.comparetup, state); else radix_sort_tuple(not_null_start, not_null_count, 0, state); } else if (null_count > 1 && state->base.onlyKey =3D=3D NULL) { qsort_tuple(null_start, null_count, = state->base.comparetup_tiebreak, state); } return; } /* ... existing branchless cyclic permutation ... */ ``` Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/