public inbox for [email protected]  
help / color / mirror / Atom feed
[PATCH] Simplify SortSupport implementation for macaddr
5+ messages / 4 participants
[nested] [flat]

* [PATCH] Simplify SortSupport implementation for macaddr
@ 2026-02-25 13:05 Aleksander Alekseev <[email protected]>
  2026-03-05 18:02 ` Re: [PATCH] Simplify SortSupport implementation for macaddr Andrey Borodin <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: Aleksander Alekseev @ 2026-02-25 13:05 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: John Naylor <[email protected]>

Hi,

Since Datums are 64-bit values and MAC addresses have only 6 bytes,
the abbreviated key contains the entire MAC address and is
authoritative, as pointed out by John Naylor [1]

This fact eliminates the need for cardinality estimation using
HyperLogLog since macaddr_abbrev_convert() is dirt cheap and not
lossy. There are no reasons to give up on abbreviation.

Potentially we could go even further and pass MAC addresses by value
rather by reference [2]. This would eliminate the need of abbreviation
completely since SortSupport->comparator could just compare two
Datums, as we do for Timestamps. This is a more invasive change though
that deserves more discussion and thus not proposed here.

[1]: https://postgr.es/m/CANWCAZYWdOEnoL_88VpMge1RtRpBz-VRCjdcu-eA4q3U6LvpDw%40mail.gmail.com
[2]: https://postgre.es/m/CAJ7c6TM8up%3DYih8pRLPy4wHwLzHf7w22tQ80-8ZBm__E%3D8_5HA%40mail.gmail.com

-- 
Best regards,
Aleksander Alekseev


Attachments:

  [text/x-patch] v1-0001-Simplify-SortSupport-implementation-for-macaddr.patch (5.8K, 2-v1-0001-Simplify-SortSupport-implementation-for-macaddr.patch)
  download | inline diff:
From 558c090d4e112794fb25ca16aa83874c5d2fafb3 Mon Sep 17 00:00:00 2001
From: Aleksander Alekseev <[email protected]>
Date: Wed, 25 Feb 2026 14:40:26 +0300
Subject: [PATCH v1] Simplify SortSupport implementation for macaddr

Since Datums are 64-bit values and MAC addresses have only 6 bytes, the
abbreviated key contains the entire MAC address and is authoritative. Thus
there is no need for cardinality estimation using HyperLogLog.

Author: Aleksander Alekseev <[email protected]>
Suggested-by: John Naylor <[email protected]>
Reviewed-by: TODO FIXME
Discussion: TODO FIXME
---
 src/backend/utils/adt/mac.c | 100 ++----------------------------------
 1 file changed, 5 insertions(+), 95 deletions(-)

diff --git a/src/backend/utils/adt/mac.c b/src/backend/utils/adt/mac.c
index f14675dea40..433dce015c7 100644
--- a/src/backend/utils/adt/mac.c
+++ b/src/backend/utils/adt/mac.c
@@ -14,11 +14,9 @@
 #include "postgres.h"
 
 #include "common/hashfn.h"
-#include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
 #include "port/pg_bswap.h"
 #include "utils/fmgrprotos.h"
-#include "utils/guc.h"
 #include "utils/inet.h"
 #include "utils/sortsupport.h"
 
@@ -33,15 +31,6 @@
 #define lobits(addr) \
   ((unsigned long)(((addr)->d<<16)|((addr)->e<<8)|((addr)->f)))
 
-/* sortsupport for macaddr */
-typedef struct
-{
-	int64		input_count;	/* number of non-null values seen */
-	bool		estimating;		/* true if estimating cardinality */
-
-	hyperLogLogState abbr_card; /* cardinality estimator */
-} macaddr_sortsupport_state;
-
 static int	macaddr_cmp_internal(macaddr *a1, macaddr *a2);
 static int	macaddr_fast_cmp(Datum x, Datum y, SortSupport ssup);
 static bool macaddr_abbrev_abort(int memtupcount, SortSupport ssup);
@@ -369,24 +358,10 @@ macaddr_sortsupport(PG_FUNCTION_ARGS)
 
 	if (ssup->abbreviate)
 	{
-		macaddr_sortsupport_state *uss;
-		MemoryContext oldcontext;
-
-		oldcontext = MemoryContextSwitchTo(ssup->ssup_cxt);
-
-		uss = palloc_object(macaddr_sortsupport_state);
-		uss->input_count = 0;
-		uss->estimating = true;
-		initHyperLogLog(&uss->abbr_card, 10);
-
-		ssup->ssup_extra = uss;
-
 		ssup->comparator = ssup_datum_unsigned_cmp;
 		ssup->abbrev_converter = macaddr_abbrev_convert;
 		ssup->abbrev_abort = macaddr_abbrev_abort;
 		ssup->abbrev_full_comparator = macaddr_fast_cmp;
-
-		MemoryContextSwitchTo(oldcontext);
 	}
 
 	PG_RETURN_VOID();
@@ -406,61 +381,12 @@ macaddr_fast_cmp(Datum x, Datum y, SortSupport ssup)
 }
 
 /*
- * Callback for estimating effectiveness of abbreviated key optimization.
- *
- * We pay no attention to the cardinality of the non-abbreviated data, because
- * there is no equality fast-path within authoritative macaddr comparator.
+ * Abbreviation is never aborted for macaddr because the 6-byte MAC address
+ * fits entirely within a 64-bit Datum, making the abbreviated key authoritative.
  */
 static bool
 macaddr_abbrev_abort(int memtupcount, SortSupport ssup)
 {
-	macaddr_sortsupport_state *uss = ssup->ssup_extra;
-	double		abbr_card;
-
-	if (memtupcount < 10000 || uss->input_count < 10000 || !uss->estimating)
-		return false;
-
-	abbr_card = estimateHyperLogLog(&uss->abbr_card);
-
-	/*
-	 * If we have >100k distinct values, then even if we were sorting many
-	 * billion rows we'd likely still break even, and the penalty of undoing
-	 * that many rows of abbrevs would probably not be worth it. At this point
-	 * we stop counting because we know that we're now fully committed.
-	 */
-	if (abbr_card > 100000.0)
-	{
-		if (trace_sort)
-			elog(LOG,
-				 "macaddr_abbrev: estimation ends at cardinality %f"
-				 " after " INT64_FORMAT " values (%d rows)",
-				 abbr_card, uss->input_count, memtupcount);
-		uss->estimating = false;
-		return false;
-	}
-
-	/*
-	 * Target minimum cardinality is 1 per ~2k of non-null inputs. 0.5 row
-	 * fudge factor allows us to abort earlier on genuinely pathological data
-	 * where we've had exactly one abbreviated value in the first 2k
-	 * (non-null) rows.
-	 */
-	if (abbr_card < uss->input_count / 2000.0 + 0.5)
-	{
-		if (trace_sort)
-			elog(LOG,
-				 "macaddr_abbrev: aborting abbreviation at cardinality %f"
-				 " below threshold %f after " INT64_FORMAT " values (%d rows)",
-				 abbr_card, uss->input_count / 2000.0 + 0.5, uss->input_count,
-				 memtupcount);
-		return true;
-	}
-
-	if (trace_sort)
-		elog(LOG,
-			 "macaddr_abbrev: cardinality %f after " INT64_FORMAT
-			 " values (%d rows)", abbr_card, uss->input_count, memtupcount);
-
 	return false;
 }
 
@@ -469,14 +395,13 @@ macaddr_abbrev_abort(int memtupcount, SortSupport ssup)
  * to abbreviated key representation.
  *
  * Packs the bytes of a 6-byte MAC address into a Datum and treats it as an
- * unsigned integer for purposes of comparison. On a 64-bit machine, there
- * will be two zeroed bytes of padding. The integer is converted to native
- * endianness to facilitate easy comparison.
+ * unsigned integer for purposes of comparison. There will be two zeroed bytes
+ * of padding. The integer is converted to native endianness to facilitate
+ * easy comparison.
  */
 static Datum
 macaddr_abbrev_convert(Datum original, SortSupport ssup)
 {
-	macaddr_sortsupport_state *uss = ssup->ssup_extra;
 	macaddr    *authoritative = DatumGetMacaddrP(original);
 	Datum		res;
 
@@ -489,21 +414,6 @@ macaddr_abbrev_convert(Datum original, SortSupport ssup)
 					 "Datum is too small for macaddr");
 	memset(&res, 0, sizeof(res));
 	memcpy(&res, authoritative, sizeof(macaddr));
-	uss->input_count += 1;
-
-	/*
-	 * Cardinality estimation. The estimate uses uint32, so XOR the two 32-bit
-	 * halves together to produce slightly more entropy. The two zeroed bytes
-	 * won't have any practical impact on this operation.
-	 */
-	if (uss->estimating)
-	{
-		uint32		tmp;
-
-		tmp = DatumGetUInt32(res) ^ (uint32) (DatumGetUInt64(res) >> 32);
-
-		addHyperLogLog(&uss->abbr_card, DatumGetUInt32(hash_uint32(tmp)));
-	}
 
 	/*
 	 * Byteswap on little-endian machines.
-- 
2.43.0



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: [PATCH] Simplify SortSupport implementation for macaddr
  2026-02-25 13:05 [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
@ 2026-03-05 18:02 ` Andrey Borodin <[email protected]>
  2026-03-09 13:08   ` Re: [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: Andrey Borodin @ 2026-03-05 18:02 UTC (permalink / raw)
  To: Aleksander Alekseev <[email protected]>; +Cc: pgsql-hackers; John Naylor <[email protected]>



> On 25 Feb 2026, at 18:05, Aleksander Alekseev <[email protected]> wrote:
> 
> <v1-0001-Simplify-SortSupport-implementation-for-macaddr.patch>

The patch looks correct and useful to me.

Two small points:

1. The assignment ssup->ssup_extra = NULL can be removed. The
   SortSupport struct is zeroed before the callback is called (see
   sortsupport.h). There are about 22 similar assignments elsewhere;
   it does not seem to be established practice, many other places have
   no such assignment.

2. I checked whether the existing tests actually use the SortSupport
   path. If macaddr_abbrev_abort() is made to error out, the macaddr.sql
   tests fail in two places: once for the index and once for the SELECT.
   So the current test file does exercise the SortSupport code path. I
   also tried making macaddr_abbrev_abort() always return true (so
   abbreviation is always aborted); the test dataset is small, but
   sorting seem to produce correct results.

The patch looks good to me.


Best regards, Andrey Borodin.




^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: [PATCH] Simplify SortSupport implementation for macaddr
  2026-02-25 13:05 [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
  2026-03-05 18:02 ` Re: [PATCH] Simplify SortSupport implementation for macaddr Andrey Borodin <[email protected]>
@ 2026-03-09 13:08   ` Aleksander Alekseev <[email protected]>
  2026-03-25 08:00     ` Re: [PATCH] Simplify SortSupport implementation for macaddr Michael Paquier <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: Aleksander Alekseev @ 2026-03-09 13:08 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Andrey Borodin <[email protected]>; John Naylor <[email protected]>

Hi Andrey,

Many thanks for your feedback!

> 1. The assignment ssup->ssup_extra = NULL can be removed. The
>    SortSupport struct is zeroed before the callback is called (see
>    sortsupport.h). There are about 22 similar assignments elsewhere;
>    it does not seem to be established practice, many other places have
>    no such assignment.

Agree. I removed this assignment in 0001 and added 0002 that removes
if for the rest of *_sortsupport() functions.

-- 
Best regards,
Aleksander Alekseev


Attachments:

  [text/x-patch] v2-0002-Remove-redundant-ssup_extra-zeroing-in-_sortsuppo.patch (9.9K, 2-v2-0002-Remove-redundant-ssup_extra-zeroing-in-_sortsuppo.patch)
  download | inline diff:
From 2d3b1df3b946cad0f88624dc0ae5e07b6c28640b Mon Sep 17 00:00:00 2001
From: Aleksander Alekseev <[email protected]>
Date: Mon, 9 Mar 2026 15:51:04 +0300
Subject: [PATCH v2 2/2] Remove redundant ssup_extra zeroing in *_sortsupport()

SortSupportData is guaranteed to be zeroed by callers. Therefore, zeroing
ssup_extra in *_sortsupport() functions is redundant. Remove corresponding
code in order to make all the functions consistent.

Author: Aleksander Alekseev <[email protected]>
Suggested-by: Andrey Borodin <[email protected]>
Reviewed-by: TODO FIXME
Discussion: https://postgr.es/m/CAJ7c6TMk10rF_LiMz6j9rRy1rqk-5s+wBPuBefLix4cY+-4s1w@mail.gmail.com
---
 contrib/btree_gist/btree_bit.c      | 2 --
 contrib/btree_gist/btree_bool.c     | 1 -
 contrib/btree_gist/btree_bytea.c    | 1 -
 contrib/btree_gist/btree_cash.c     | 1 -
 contrib/btree_gist/btree_date.c     | 1 -
 contrib/btree_gist/btree_float4.c   | 1 -
 contrib/btree_gist/btree_float8.c   | 1 -
 contrib/btree_gist/btree_inet.c     | 1 -
 contrib/btree_gist/btree_interval.c | 1 -
 contrib/btree_gist/btree_macaddr.c  | 1 -
 contrib/btree_gist/btree_macaddr8.c | 1 -
 contrib/btree_gist/btree_numeric.c  | 1 -
 contrib/btree_gist/btree_oid.c      | 1 -
 contrib/btree_gist/btree_text.c     | 2 --
 contrib/btree_gist/btree_time.c     | 1 -
 contrib/btree_gist/btree_ts.c       | 1 -
 contrib/btree_gist/btree_uuid.c     | 1 -
 src/backend/utils/adt/network.c     | 1 -
 src/backend/utils/adt/rangetypes.c  | 1 -
 src/backend/utils/adt/uuid.c        | 1 -
 20 files changed, 22 deletions(-)

diff --git a/contrib/btree_gist/btree_bit.c b/contrib/btree_gist/btree_bit.c
index 2b9c18a586f..3f6f73ac9dc 100644
--- a/contrib/btree_gist/btree_bit.c
+++ b/contrib/btree_gist/btree_bit.c
@@ -233,7 +233,6 @@ gbt_bit_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_bit_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
@@ -244,7 +243,6 @@ gbt_varbit_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_bit_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_bool.c b/contrib/btree_gist/btree_bool.c
index 8e59523f474..7de93662c92 100644
--- a/contrib/btree_gist/btree_bool.c
+++ b/contrib/btree_gist/btree_bool.c
@@ -182,7 +182,6 @@ gbt_bool_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_bool_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_bytea.c b/contrib/btree_gist/btree_bytea.c
index 50bb24308e9..221e4bd9490 100644
--- a/contrib/btree_gist/btree_bytea.c
+++ b/contrib/btree_gist/btree_bytea.c
@@ -187,7 +187,6 @@ gbt_bytea_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_bytea_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_cash.c b/contrib/btree_gist/btree_cash.c
index a347b1320e4..13aacc0d762 100644
--- a/contrib/btree_gist/btree_cash.c
+++ b/contrib/btree_gist/btree_cash.c
@@ -237,7 +237,6 @@ gbt_cash_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_cash_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_date.c b/contrib/btree_gist/btree_date.c
index 5e41f4574c5..74dccb9613b 100644
--- a/contrib/btree_gist/btree_date.c
+++ b/contrib/btree_gist/btree_date.c
@@ -273,7 +273,6 @@ gbt_date_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_date_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_float4.c b/contrib/btree_gist/btree_float4.c
index c076918fd48..b694a175836 100644
--- a/contrib/btree_gist/btree_float4.c
+++ b/contrib/btree_gist/btree_float4.c
@@ -226,7 +226,6 @@ gbt_float4_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_float4_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_float8.c b/contrib/btree_gist/btree_float8.c
index d7386e885a2..fb906a38610 100644
--- a/contrib/btree_gist/btree_float8.c
+++ b/contrib/btree_gist/btree_float8.c
@@ -234,7 +234,6 @@ gbt_float8_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_float8_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_inet.c b/contrib/btree_gist/btree_inet.c
index 9d04b92d3b8..9684283fa4c 100644
--- a/contrib/btree_gist/btree_inet.c
+++ b/contrib/btree_gist/btree_inet.c
@@ -204,7 +204,6 @@ gbt_inet_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_inet_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_interval.c b/contrib/btree_gist/btree_interval.c
index 6b81fa2b39b..f288d97ee09 100644
--- a/contrib/btree_gist/btree_interval.c
+++ b/contrib/btree_gist/btree_interval.c
@@ -315,7 +315,6 @@ gbt_intv_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_intv_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_macaddr.c b/contrib/btree_gist/btree_macaddr.c
index df7c0040011..140f1ad4135 100644
--- a/contrib/btree_gist/btree_macaddr.c
+++ b/contrib/btree_gist/btree_macaddr.c
@@ -211,7 +211,6 @@ gbt_macaddr_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_macaddr_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_macaddr8.c b/contrib/btree_gist/btree_macaddr8.c
index 3b0c9b81e85..8b2a01567b7 100644
--- a/contrib/btree_gist/btree_macaddr8.c
+++ b/contrib/btree_gist/btree_macaddr8.c
@@ -208,7 +208,6 @@ gbt_macad8_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_macaddr8_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_numeric.c b/contrib/btree_gist/btree_numeric.c
index dba04c3a1b3..716b46c27ba 100644
--- a/contrib/btree_gist/btree_numeric.c
+++ b/contrib/btree_gist/btree_numeric.c
@@ -246,7 +246,6 @@ gbt_numeric_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_numeric_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_oid.c b/contrib/btree_gist/btree_oid.c
index 3ddf2a993d4..70eb91d8781 100644
--- a/contrib/btree_gist/btree_oid.c
+++ b/contrib/btree_gist/btree_oid.c
@@ -236,7 +236,6 @@ gbt_oid_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_oid_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_text.c b/contrib/btree_gist/btree_text.c
index 2ac12f1cab2..903d80a42d4 100644
--- a/contrib/btree_gist/btree_text.c
+++ b/contrib/btree_gist/btree_text.c
@@ -313,7 +313,6 @@ gbt_text_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_text_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
@@ -346,7 +345,6 @@ gbt_bpchar_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_bpchar_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_time.c b/contrib/btree_gist/btree_time.c
index 9d23fb4b867..610cb90aa6f 100644
--- a/contrib/btree_gist/btree_time.c
+++ b/contrib/btree_gist/btree_time.c
@@ -341,7 +341,6 @@ gbt_time_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_timekey_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_ts.c b/contrib/btree_gist/btree_ts.c
index 0f88d5d72f4..32d77c622af 100644
--- a/contrib/btree_gist/btree_ts.c
+++ b/contrib/btree_gist/btree_ts.c
@@ -412,7 +412,6 @@ gbt_ts_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_ts_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/contrib/btree_gist/btree_uuid.c b/contrib/btree_gist/btree_uuid.c
index c891840fb25..a1345a8401f 100644
--- a/contrib/btree_gist/btree_uuid.c
+++ b/contrib/btree_gist/btree_uuid.c
@@ -251,7 +251,6 @@ gbt_uuid_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = gbt_uuid_ssup_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/src/backend/utils/adt/network.c b/src/backend/utils/adt/network.c
index 3a2002097dd..223672ec8b4 100644
--- a/src/backend/utils/adt/network.c
+++ b/src/backend/utils/adt/network.c
@@ -435,7 +435,6 @@ network_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = network_fast_cmp;
-	ssup->ssup_extra = NULL;
 
 	if (ssup->abbreviate)
 	{
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index 06cc3af4f4a..883c0e40cf3 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -1467,7 +1467,6 @@ range_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = range_fast_cmp;
-	ssup->ssup_extra = NULL;
 
 	PG_RETURN_VOID();
 }
diff --git a/src/backend/utils/adt/uuid.c b/src/backend/utils/adt/uuid.c
index 6ee3752ac78..64519d4ead6 100644
--- a/src/backend/utils/adt/uuid.c
+++ b/src/backend/utils/adt/uuid.c
@@ -279,7 +279,6 @@ uuid_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = uuid_fast_cmp;
-	ssup->ssup_extra = NULL;
 
 	if (ssup->abbreviate)
 	{
-- 
2.43.0



  [text/x-patch] v2-0001-Simplify-SortSupport-implementation-for-macaddr.patch (6.1K, 3-v2-0001-Simplify-SortSupport-implementation-for-macaddr.patch)
  download | inline diff:
From cee3499a36d8c5457fe8cedf538ed84e1b700c88 Mon Sep 17 00:00:00 2001
From: Aleksander Alekseev <[email protected]>
Date: Wed, 25 Feb 2026 14:40:26 +0300
Subject: [PATCH v2 1/2] Simplify SortSupport implementation for macaddr

Since Datums are 64-bit values and MAC addresses have only 6 bytes, the
abbreviated key contains the entire MAC address and is authoritative. Thus
there is no need for cardinality estimation using HyperLogLog.

Author: Aleksander Alekseev <[email protected]>
Suggested-by: John Naylor <[email protected]>
Reviewed-by: Andrey Borodin <[email protected]>
Discussion: https://postgr.es/m/CAJ7c6TMk10rF_LiMz6j9rRy1rqk-5s+wBPuBefLix4cY+-4s1w@mail.gmail.com
---
 src/backend/utils/adt/mac.c | 101 ++----------------------------------
 1 file changed, 5 insertions(+), 96 deletions(-)

diff --git a/src/backend/utils/adt/mac.c b/src/backend/utils/adt/mac.c
index f14675dea40..b72dee37319 100644
--- a/src/backend/utils/adt/mac.c
+++ b/src/backend/utils/adt/mac.c
@@ -14,11 +14,9 @@
 #include "postgres.h"
 
 #include "common/hashfn.h"
-#include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
 #include "port/pg_bswap.h"
 #include "utils/fmgrprotos.h"
-#include "utils/guc.h"
 #include "utils/inet.h"
 #include "utils/sortsupport.h"
 
@@ -33,15 +31,6 @@
 #define lobits(addr) \
   ((unsigned long)(((addr)->d<<16)|((addr)->e<<8)|((addr)->f)))
 
-/* sortsupport for macaddr */
-typedef struct
-{
-	int64		input_count;	/* number of non-null values seen */
-	bool		estimating;		/* true if estimating cardinality */
-
-	hyperLogLogState abbr_card; /* cardinality estimator */
-} macaddr_sortsupport_state;
-
 static int	macaddr_cmp_internal(macaddr *a1, macaddr *a2);
 static int	macaddr_fast_cmp(Datum x, Datum y, SortSupport ssup);
 static bool macaddr_abbrev_abort(int memtupcount, SortSupport ssup);
@@ -365,28 +354,13 @@ macaddr_sortsupport(PG_FUNCTION_ARGS)
 	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
 
 	ssup->comparator = macaddr_fast_cmp;
-	ssup->ssup_extra = NULL;
 
 	if (ssup->abbreviate)
 	{
-		macaddr_sortsupport_state *uss;
-		MemoryContext oldcontext;
-
-		oldcontext = MemoryContextSwitchTo(ssup->ssup_cxt);
-
-		uss = palloc_object(macaddr_sortsupport_state);
-		uss->input_count = 0;
-		uss->estimating = true;
-		initHyperLogLog(&uss->abbr_card, 10);
-
-		ssup->ssup_extra = uss;
-
 		ssup->comparator = ssup_datum_unsigned_cmp;
 		ssup->abbrev_converter = macaddr_abbrev_convert;
 		ssup->abbrev_abort = macaddr_abbrev_abort;
 		ssup->abbrev_full_comparator = macaddr_fast_cmp;
-
-		MemoryContextSwitchTo(oldcontext);
 	}
 
 	PG_RETURN_VOID();
@@ -406,61 +380,12 @@ macaddr_fast_cmp(Datum x, Datum y, SortSupport ssup)
 }
 
 /*
- * Callback for estimating effectiveness of abbreviated key optimization.
- *
- * We pay no attention to the cardinality of the non-abbreviated data, because
- * there is no equality fast-path within authoritative macaddr comparator.
+ * Abbreviation is never aborted for macaddr because the 6-byte MAC address
+ * fits entirely within a 64-bit Datum, making the abbreviated key authoritative.
  */
 static bool
 macaddr_abbrev_abort(int memtupcount, SortSupport ssup)
 {
-	macaddr_sortsupport_state *uss = ssup->ssup_extra;
-	double		abbr_card;
-
-	if (memtupcount < 10000 || uss->input_count < 10000 || !uss->estimating)
-		return false;
-
-	abbr_card = estimateHyperLogLog(&uss->abbr_card);
-
-	/*
-	 * If we have >100k distinct values, then even if we were sorting many
-	 * billion rows we'd likely still break even, and the penalty of undoing
-	 * that many rows of abbrevs would probably not be worth it. At this point
-	 * we stop counting because we know that we're now fully committed.
-	 */
-	if (abbr_card > 100000.0)
-	{
-		if (trace_sort)
-			elog(LOG,
-				 "macaddr_abbrev: estimation ends at cardinality %f"
-				 " after " INT64_FORMAT " values (%d rows)",
-				 abbr_card, uss->input_count, memtupcount);
-		uss->estimating = false;
-		return false;
-	}
-
-	/*
-	 * Target minimum cardinality is 1 per ~2k of non-null inputs. 0.5 row
-	 * fudge factor allows us to abort earlier on genuinely pathological data
-	 * where we've had exactly one abbreviated value in the first 2k
-	 * (non-null) rows.
-	 */
-	if (abbr_card < uss->input_count / 2000.0 + 0.5)
-	{
-		if (trace_sort)
-			elog(LOG,
-				 "macaddr_abbrev: aborting abbreviation at cardinality %f"
-				 " below threshold %f after " INT64_FORMAT " values (%d rows)",
-				 abbr_card, uss->input_count / 2000.0 + 0.5, uss->input_count,
-				 memtupcount);
-		return true;
-	}
-
-	if (trace_sort)
-		elog(LOG,
-			 "macaddr_abbrev: cardinality %f after " INT64_FORMAT
-			 " values (%d rows)", abbr_card, uss->input_count, memtupcount);
-
 	return false;
 }
 
@@ -469,14 +394,13 @@ macaddr_abbrev_abort(int memtupcount, SortSupport ssup)
  * to abbreviated key representation.
  *
  * Packs the bytes of a 6-byte MAC address into a Datum and treats it as an
- * unsigned integer for purposes of comparison. On a 64-bit machine, there
- * will be two zeroed bytes of padding. The integer is converted to native
- * endianness to facilitate easy comparison.
+ * unsigned integer for purposes of comparison. There will be two zeroed bytes
+ * of padding. The integer is converted to native endianness to facilitate
+ * easy comparison.
  */
 static Datum
 macaddr_abbrev_convert(Datum original, SortSupport ssup)
 {
-	macaddr_sortsupport_state *uss = ssup->ssup_extra;
 	macaddr    *authoritative = DatumGetMacaddrP(original);
 	Datum		res;
 
@@ -489,21 +413,6 @@ macaddr_abbrev_convert(Datum original, SortSupport ssup)
 					 "Datum is too small for macaddr");
 	memset(&res, 0, sizeof(res));
 	memcpy(&res, authoritative, sizeof(macaddr));
-	uss->input_count += 1;
-
-	/*
-	 * Cardinality estimation. The estimate uses uint32, so XOR the two 32-bit
-	 * halves together to produce slightly more entropy. The two zeroed bytes
-	 * won't have any practical impact on this operation.
-	 */
-	if (uss->estimating)
-	{
-		uint32		tmp;
-
-		tmp = DatumGetUInt32(res) ^ (uint32) (DatumGetUInt64(res) >> 32);
-
-		addHyperLogLog(&uss->abbr_card, DatumGetUInt32(hash_uint32(tmp)));
-	}
 
 	/*
 	 * Byteswap on little-endian machines.
-- 
2.43.0



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: [PATCH] Simplify SortSupport implementation for macaddr
  2026-02-25 13:05 [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
  2026-03-05 18:02 ` Re: [PATCH] Simplify SortSupport implementation for macaddr Andrey Borodin <[email protected]>
  2026-03-09 13:08   ` Re: [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
@ 2026-03-25 08:00     ` Michael Paquier <[email protected]>
  2026-04-07 06:51       ` Re: [PATCH] Simplify SortSupport implementation for macaddr John Naylor <[email protected]>
  0 siblings, 1 reply; 5+ messages in thread

From: Michael Paquier @ 2026-03-25 08:00 UTC (permalink / raw)
  To: Aleksander Alekseev <[email protected]>; +Cc: pgsql-hackers; Andrey Borodin <[email protected]>; John Naylor <[email protected]>; Tom Lane <[email protected]>

On Mon, Mar 09, 2026 at 04:08:06PM +0300, Aleksander Alekseev wrote:
>> 1. The assignment ssup->ssup_extra = NULL can be removed. The
>>    SortSupport struct is zeroed before the callback is called (see
>>    sortsupport.h). There are about 22 similar assignments elsewhere;
>>    it does not seem to be established practice, many other places have
>>    no such assignment.
> 
> Agree. I removed this assignment in 0001 and added 0002 that removes
> if for the rest of *_sortsupport() functions.

Sounds sensible to get rid of the estimation with the Datum size
requirement and never give up with the abbreviated key sort, as done
in 0001.  I'd leave the code touched by 0002 remain as-is.

@Tom, why didn't you consider that as a continuation of 6aebedc38497?
Just to keep the changes with SIZEOF_DATUM < 8 leaner, or just because
it was not worth bothering based on your TODO list?  I am wondering if
there is a reason I may be missing here.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: [PATCH] Simplify SortSupport implementation for macaddr
  2026-02-25 13:05 [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
  2026-03-05 18:02 ` Re: [PATCH] Simplify SortSupport implementation for macaddr Andrey Borodin <[email protected]>
  2026-03-09 13:08   ` Re: [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
  2026-03-25 08:00     ` Re: [PATCH] Simplify SortSupport implementation for macaddr Michael Paquier <[email protected]>
@ 2026-04-07 06:51       ` John Naylor <[email protected]>
  0 siblings, 0 replies; 5+ messages in thread

From: John Naylor @ 2026-04-07 06:51 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Aleksander Alekseev <[email protected]>; pgsql-hackers; Andrey Borodin <[email protected]>; Tom Lane <[email protected]>

On Wed, Mar 25, 2026 at 3:01 PM Michael Paquier <[email protected]> wrote:
>
> On Mon, Mar 09, 2026 at 04:08:06PM +0300, Aleksander Alekseev wrote:
> >> 1. The assignment ssup->ssup_extra = NULL can be removed. The
> >>    SortSupport struct is zeroed before the callback is called (see
> >>    sortsupport.h). There are about 22 similar assignments elsewhere;
> >>    it does not seem to be established practice, many other places have
> >>    no such assignment.
> >
> > Agree. I removed this assignment in 0001 and added 0002 that removes
> > if for the rest of *_sortsupport() functions.
>
> Sounds sensible to get rid of the estimation with the Datum size
> requirement and never give up with the abbreviated key sort, as done
> in 0001.  I'd leave the code touched by 0002 remain as-is.

I've committed v2-0001, thanks Aleksander! Without 0002, 0001 made
this code inconsistent with the tree by removing the assignment, so I
left it as in master.

--
John Naylor
Amazon Web Services





^ permalink  raw  reply  [nested|flat] 5+ messages in thread


end of thread, other threads:[~2026-04-07 06:51 UTC | newest]

Thread overview: 5+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-02-25 13:05 [PATCH] Simplify SortSupport implementation for macaddr Aleksander Alekseev <[email protected]>
2026-03-05 18:02 ` Andrey Borodin <[email protected]>
2026-03-09 13:08   ` Aleksander Alekseev <[email protected]>
2026-03-25 08:00     ` Michael Paquier <[email protected]>
2026-04-07 06:51       ` John Naylor <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox