public inbox for [email protected]  
help / color / mirror / Atom feed
From: Haibo Yan <[email protected]>
To: PostgreSQL-development <[email protected]>
Subject: Re: [PATCH] DISTINCT in plain aggregate window functions
Date: Wed, 29 Apr 2026 18:57:07 -0700
Message-ID: <CABXr29EW_+XRD4dTcEN4+=K78OvetMWMqF0qgSv_JwG6D4Gu3g@mail.gmail.com> (raw)
In-Reply-To: <CABXr29H2X+HtaPw-R3EheZUgv9fM7nSBjQCCaWCRv62mDYdM3w@mail.gmail.com>
References: <CABXr29H2X+HtaPw-R3EheZUgv9fM7nSBjQCCaWCRv62mDYdM3w@mail.gmail.com>

On Tue, Apr 7, 2026 at 10:31 PM Haibo Yan <[email protected]> wrote:

> Hi Hackers
>
> I’d like to start a patch series to add support for DISTINCT in plain
> aggregate window functions.
>
> PostgreSQL currently rejects cases such as:
>
>
> ---------------------------------------------------------------------------------------------------------
>
> count(DISTINCT x) OVER (PARTITION BY p)
>
> sum(DISTINCT x)   OVER ()
>
>
> ---------------------------------------------------------------------------------------------------------
>
> My plan is to implement this incrementally, by frame class and by feature
> dimension, rather than trying to solve every case in a single patch.
>
> For the first step, I’m posting patches 1-2 only and would appreciate your
> review on those.
>
> Patch 1 is intentionally very small:
>
>
>    - add parse/deparse plumbing for DISTINCT in plain aggregate window
>    functions
>    - carry the information through WindowFunc
>    - preserve it in ruleutils / deparse
>    - but still reject execution
>
> Patch 1 by itself does not add user-visible execution support, so I think
> it is best reviewed together with patch 2.
>
> Patch 2 adds the first real executor support:
>
>
>    - plain aggregate window functions only
>    - single-argument DISTINCT only
>    - whole-partition frames only
>
> That means support for cases where the frame is effectively the entire
> partition, for example:
>
>
> ---------------------------------------------------------------------------------------------------------
>
> count(DISTINCT x) OVER (PARTITION BY p)
> sum(DISTINCT x)   OVER ()
> avg(DISTINCT x)   OVER (
>     PARTITION BY p
>     ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> )
>
>
> ---------------------------------------------------------------------------------------------------------
>
> The executor approach in patch 2 is deliberately conservative:
>
>
>    - collect the partition’s aggregate inputs
>    - sort and deduplicate them
>    - feed the distinct values into the aggregate transition function
>    - finalize once
>    - reuse the cached result for all rows in the partition
>
> This avoids the much harder moving-frame cases for now.
>
> My proposed overall roadmap is below:
>
> Patch 1
>
>
>    - parse/deparse plumbing only
>    - allow DISTINCT to be represented on plain aggregate window functions
>    - preserve it through deparse / view definition
>    - still reject execution
>
> Patch 2
>
>
>    - executor support for whole-partition frames
>    - plain aggregate window functions only
>    - single-argument DISTINCT only
>    - sort-and-dedup implementation
>
> Patch 3
>
>
>    - executor support for non-shrinking frames
>    - frames starting at UNBOUNDED PRECEDING with no EXCLUDE
>    - incremental hash-based seen-set
>    - covers default ORDER BY frame and supported ... CURRENT ROW / ...
>    FOLLOWING cases
>
> Patch 4
>
>
>    - executor support for sliding ROWS frames
>    - refcounted DISTINCT state
>    - add/remove distinct contributions as rows enter and leave the frame
>    - fallback to restart/recompute for aggregates without inverse
>    transition support
>
> Patch 5
>
>
>    - extend the sliding DISTINCT machinery to sliding RANGE and GROUPS
>    - keep the same refcounted model
>    - no EXCLUDE yet
>
> Patch 6
>
>
>    - support EXCLUDE clauses
>    - likely correctness-first, with restart/recompute where incremental
>    maintenance is too awkward
>
> Patch 7
>
>
>    - support multi-argument DISTINCT
>    - upgrade DISTINCT keys from single datum to tuple/composite key
>    representation
>
> Patch 8
>
>
>    - support aggregate ORDER BY inside window aggregates
>    - left until last because it is orthogonal to frame-shape support and
>    substantially complicates both parse representation and executor behavior
>
> In short, the roadmap is:
>
>
>    1. plumbing
>    2. whole-partition
>    3. non-shrinking
>    4. sliding ROWS
>    5. sliding RANGE / GROUPS
>    6. EXCLUDE
>    7. multi-arg DISTINCT
>    8. aggregate ORDER BY
>
> For this posting, I’d especially appreciate feedback on:
>
>
>    - whether patch 1 + patch 2 is a reasonable first split
>    - whether whole-partition-only executor support is a good first
>    executable step
>    - whether the proposed long-term breakdown seems sensible
>
> Thanks in advance for any review or comments.
>
> Best regards,
>
> Haibo Yan
>
>
> I’ve managed to finish the first sub-series adding initial support for
DISTINCT in plain aggregate window functions.

Patch 1 teaches the parser and deparser to accept DISTINCT in plain
window aggregates. This is representation-only and does not change
execution yet.

Patch 2 adds executor support for the simplest case, whole-partition
frames, using a sort-and-deduplicate path.

Patch 3 extends that support to non-shrinking frames, where rows only
enter the frame, by using an incremental hash-based seen-set instead of
restarting the aggregate for each row.

Please review.

Regards,
Haibo


Attachments:

  [application/octet-stream] v2-0003-Extend-DISTINCT-window-aggregates-to-grow-only-fr.patch (26.7K, 3-v2-0003-Extend-DISTINCT-window-aggregates-to-grow-only-fr.patch)
  download | inline diff:
From f3b5e1f410f3616271c246b0cb33d5f46ec8f58c Mon Sep 17 00:00:00 2001
From: Haibo Yan <[email protected]>
Date: Tue, 7 Apr 2026 14:50:12 -0700
Subject: [PATCH v2 03/10] Extend DISTINCT window aggregates to grow-only
 frames
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Support DISTINCT for frames starting at UNBOUNDED PRECEDING with no
EXCLUDE clause.  These frames only grow, so an incremental hash table
tracks values seen so far — new values invoke the transition function,
duplicates are skipped.  The argument type must be hashable.
Whole-partition frames keep the existing sort-based path.
---
 src/backend/executor/nodeWindowAgg.c | 333 +++++++++++++++++++++++++--
 src/test/regress/expected/window.out | 199 ++++++++++++++--
 src/test/regress/sql/window.sql      |  71 +++++-
 3 files changed, 558 insertions(+), 45 deletions(-)

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 155745b8952..dcb8949491e 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -181,6 +181,15 @@ typedef struct WindowStatePerAggData
 	Oid			sortCollation;	/* collation for sort/equality */
 	bool		sortNullsFirst; /* NULLS FIRST? */
 	FmgrInfo	equalfn;		/* equality comparison function */
+
+	/* Grow-only frame DISTINCT: hash-based seen-set */
+	bool		distinctIsHash;		/* using hash-based incremental DISTINCT? */
+	TupleHashTable distinctTable;	/* hash table of seen values, or NULL */
+	MemoryContext distinctContext;	/* BumpContext for hash table tuples */
+	TupleDesc	distinctTupleDesc;	/* single-column tuple descriptor */
+	TupleTableSlot *distinctSlot;	/* slot for hash table lookups */
+	Oid		   *hashEqFuncOids;		/* equality function OID for hash (palloc'd) */
+	FmgrInfo   *hashFunctions;		/* hash function info (palloc'd) */
 } WindowStatePerAggData;
 
 static void initialize_windowaggregate(WindowAggState *winstate,
@@ -218,9 +227,13 @@ static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate,
 static Datum GetAggInitVal(Datum textInitVal, Oid transtype);
 
 static bool is_whole_partition_frame(WindowAggState *winstate);
+static bool is_grow_only_frame(WindowAggState *winstate);
 static void eval_windowaggregate_distinct(WindowAggState *winstate,
 										  WindowStatePerFunc perfuncstate,
 										  WindowStatePerAgg peraggstate);
+static void advance_windowaggregate_distinct(WindowAggState *winstate,
+											 WindowStatePerFunc perfuncstate,
+											 WindowStatePerAgg peraggstate);
 
 static bool are_peers(WindowAggState *winstate, TupleTableSlot *slot1,
 					  TupleTableSlot *slot2);
@@ -293,6 +306,43 @@ initialize_windowaggregate(WindowAggState *winstate,
 	peraggstate->transValueCount = 0;
 	peraggstate->resultValue = (Datum) 0;
 	peraggstate->resultValueIsNull = true;
+
+	/*
+	 * For hash-based DISTINCT (grow-only frames), create a fresh hash table
+	 * each time the aggregate is restarted (i.e., at the start of each new
+	 * partition).  The previous distinctContext (if any) was a child of
+	 * aggcontext and was already destroyed by the MemoryContextReset above
+	 * (for private aggcontexts) or by the caller's reset of the shared
+	 * aggcontext.
+	 */
+	if (peraggstate->distinctIsHash)
+	{
+		Oid			collations[1];
+		AttrNumber	keyColIdx[1] = {1};
+
+		collations[0] = peraggstate->sortCollation;
+
+		peraggstate->distinctContext =
+			BumpContextCreate(peraggstate->aggcontext,
+							  "WindowAgg Distinct Hash Tuples",
+							  ALLOCSET_DEFAULT_SIZES);
+
+		peraggstate->distinctTable =
+			BuildTupleHashTable(&winstate->ss.ps,
+								peraggstate->distinctTupleDesc,
+								&TTSOpsVirtual,
+								1,		/* numCols */
+								keyColIdx,
+								peraggstate->hashEqFuncOids,
+								peraggstate->hashFunctions,
+								collations,
+								256,	/* initial estimate */
+								0,		/* additionalsize */
+								peraggstate->aggcontext,
+								peraggstate->distinctContext,
+								winstate->tmpcontext->ecxt_per_tuple_memory,
+								false);
+	}
 }
 
 /*
@@ -750,6 +800,27 @@ is_whole_partition_frame(WindowAggState *winstate)
 	return false;
 }
 
+/*
+ * is_grow_only_frame
+ *
+ * Returns true if the window frame only grows as the current row advances,
+ * i.e. rows only enter the frame, never leave it.  This is the case when the
+ * start is UNBOUNDED PRECEDING and there is no EXCLUSION clause.
+ *
+ * Note: every whole-partition frame is also a grow-only frame.
+ */
+static bool
+is_grow_only_frame(WindowAggState *winstate)
+{
+	int			frameOptions = winstate->frameOptions;
+
+	if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
+		return false;
+	if (frameOptions & FRAMEOPTION_EXCLUSION)
+		return false;
+	return true;
+}
+
 /*
  * eval_windowaggregate_distinct
  *
@@ -1000,6 +1071,164 @@ remember_value:
 	tuplesort_end(sortstate);
 }
 
+/*
+ * advance_windowaggregate_distinct
+ *
+ * Per-row handler for hash-based DISTINCT window aggregates with grow-only
+ * frames.  Evaluates the single argument, checks the hash table for
+ * duplicates, and only advances the transition function for new distinct
+ * values.
+ */
+static void
+advance_windowaggregate_distinct(WindowAggState *winstate,
+								 WindowStatePerFunc perfuncstate,
+								 WindowStatePerAgg peraggstate)
+{
+	LOCAL_FCINFO(fcinfo, FUNC_MAX_ARGS);
+	WindowFuncExprState *wfuncstate = perfuncstate->wfuncstate;
+	int			numArguments = perfuncstate->numArguments;
+	ExprContext *econtext = winstate->tmpcontext;
+	ExprState  *filter = wfuncstate->aggfilter;
+	TupleTableSlot *slot = peraggstate->distinctSlot;
+	Datum		newVal;
+	bool		isnull;
+	bool		isnew;
+	MemoryContext oldContext;
+
+	oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+	/* Skip anything FILTERed out */
+	if (filter)
+	{
+		Datum		res = ExecEvalExpr(filter, econtext, &isnull);
+
+		if (isnull || !DatumGetBool(res))
+		{
+			MemoryContextSwitchTo(oldContext);
+			return;
+		}
+	}
+
+	/* Evaluate the single argument */
+	{
+		ExprState  *argstate = (ExprState *) linitial(wfuncstate->args);
+
+		newVal = ExecEvalExpr(argstate, econtext, &isnull);
+	}
+
+	/* Store in slot for hash lookup */
+	ExecClearTuple(slot);
+	slot->tts_values[0] = newVal;
+	slot->tts_isnull[0] = isnull;
+	ExecStoreVirtualTuple(slot);
+
+	/* Check if already seen */
+	LookupTupleHashEntry(peraggstate->distinctTable, slot, &isnew, NULL);
+
+	if (!isnew)
+	{
+		MemoryContextSwitchTo(oldContext);
+		return;		/* already seen, skip */
+	}
+
+	/*
+	 * New distinct value: advance the transition function.
+	 * This replicates the strict-handling + transfn logic from
+	 * advance_windowaggregate().
+	 */
+	if (peraggstate->transfn.fn_strict)
+	{
+		/* For strict transfn, skip NULL inputs */
+		if (isnull)
+		{
+			MemoryContextSwitchTo(oldContext);
+			return;
+		}
+
+		/*
+		 * For strict transition functions with initial value NULL, use the
+		 * first non-NULL input as the initial state.
+		 */
+		if (peraggstate->transValueCount == 0 && peraggstate->transValueIsNull)
+		{
+			MemoryContextSwitchTo(peraggstate->aggcontext);
+			peraggstate->transValue = datumCopy(newVal,
+												peraggstate->transtypeByVal,
+												peraggstate->transtypeLen);
+			peraggstate->transValueIsNull = false;
+			peraggstate->transValueCount = 1;
+			MemoryContextSwitchTo(oldContext);
+			return;
+		}
+
+		if (peraggstate->transValueIsNull)
+		{
+			/*
+			 * Don't call a strict function with NULL inputs.  Note it is
+			 * possible to get here despite the above tests, if the transfn
+			 * is strict *and* returned a NULL on a prior cycle.  If that
+			 * happens we will propagate the NULL all the way to the end.
+			 * Hash-based DISTINCT never uses moving-aggregate code, so
+			 * invtransfn_oid should always be invalid here.
+			 */
+			MemoryContextSwitchTo(oldContext);
+			Assert(!OidIsValid(peraggstate->invtransfn_oid));
+			return;
+		}
+	}
+
+	/* OK to call the transition function */
+	InitFunctionCallInfoData(*fcinfo, &(peraggstate->transfn),
+							 numArguments + 1,
+							 perfuncstate->winCollation,
+							 (Node *) winstate, NULL);
+	fcinfo->args[0].value = peraggstate->transValue;
+	fcinfo->args[0].isnull = peraggstate->transValueIsNull;
+	fcinfo->args[1].value = newVal;
+	fcinfo->args[1].isnull = isnull;
+	winstate->curaggcontext = peraggstate->aggcontext;
+
+	{
+		Datum		result;
+
+		result = FunctionCallInvoke(fcinfo);
+		winstate->curaggcontext = NULL;
+
+		peraggstate->transValueCount++;
+
+		if (!peraggstate->transtypeByVal &&
+			DatumGetPointer(result) != DatumGetPointer(peraggstate->transValue))
+		{
+			if (!fcinfo->isnull)
+			{
+				MemoryContextSwitchTo(peraggstate->aggcontext);
+				if (DatumIsReadWriteExpandedObject(result,
+												   false,
+												   peraggstate->transtypeLen) &&
+					MemoryContextGetParent(DatumGetEOHP(result)->eoh_context) == CurrentMemoryContext)
+					 /* do nothing */ ;
+				else
+					result = datumCopy(result,
+									   peraggstate->transtypeByVal,
+									   peraggstate->transtypeLen);
+			}
+			if (!peraggstate->transValueIsNull)
+			{
+				if (DatumIsReadWriteExpandedObject(peraggstate->transValue,
+												   false,
+												   peraggstate->transtypeLen))
+					DeleteExpandedObject(peraggstate->transValue);
+				else
+					pfree(DatumGetPointer(peraggstate->transValue));
+			}
+		}
+
+		MemoryContextSwitchTo(oldContext);
+		peraggstate->transValue = result;
+		peraggstate->transValueIsNull = fcinfo->isnull;
+	}
+}
+
 /*
  * eval_windowaggregates
  * evaluate plain aggregates being used as window functions
@@ -1256,13 +1485,16 @@ eval_windowaggregates(WindowAggState *winstate)
 	/*
 	 * Compute DISTINCT aggregates for the whole partition.  These are handled
 	 * separately via sort-based deduplication rather than the main
-	 * accumulation loop below.
+	 * accumulation loop below.  Hash-based DISTINCT (grow-only frames) is
+	 * handled incrementally in the main loop instead.
 	 */
 	for (i = 0; i < numaggs; i++)
 	{
 		peraggstate = &winstate->peragg[i];
 		if (!peraggstate->windistinct || !peraggstate->restart)
 			continue;
+		if (peraggstate->distinctIsHash)
+			continue;		/* handled incrementally in main loop */
 		wfuncno = peraggstate->wfuncno;
 		eval_windowaggregate_distinct(winstate,
 									  &winstate->perfunc[wfuncno],
@@ -1325,10 +1557,20 @@ eval_windowaggregates(WindowAggState *winstate)
 		{
 			peraggstate = &winstate->peragg[i];
 
-			/* DISTINCT aggregates are handled separately */
-			if (peraggstate->windistinct)
+			/* Sort-based DISTINCT aggregates are handled separately */
+			if (peraggstate->windistinct && !peraggstate->distinctIsHash)
 				continue;
 
+			/* Hash-based DISTINCT: advance via per-row dedup */
+			if (peraggstate->windistinct && peraggstate->distinctIsHash)
+			{
+				wfuncno = peraggstate->wfuncno;
+				advance_windowaggregate_distinct(winstate,
+												 &winstate->perfunc[wfuncno],
+												 peraggstate);
+				continue;
+			}
+
 			/* Non-restarted aggs skip until aggregatedupto_nonrestarted */
 			if (!peraggstate->restart &&
 				winstate->aggregatedupto < aggregatedupto_nonrestarted)
@@ -1497,18 +1739,26 @@ prepare_tuplestore(WindowAggState *winstate)
 		}
 
 		/*
-		 * If any aggregate uses DISTINCT, the read pointer also needs
-		 * BACKWARD capability.  The DISTINCT helper reads through the
-		 * entire partition to collect values for sorting, which advances
-		 * the read pointer to the end.  The main accumulation loop (for
-		 * non-DISTINCT aggregates in the same WindowAgg node) then needs
-		 * to rewind back to the frame head.
+		 * If any aggregate uses sort-based DISTINCT (whole-partition path),
+		 * the read pointer also needs BACKWARD capability.  The DISTINCT
+		 * helper reads through the entire partition to collect values for
+		 * sorting, which advances the read pointer to the end.  The main
+		 * accumulation loop (for non-DISTINCT aggregates in the same
+		 * WindowAgg node) then needs to rewind back to the frame head.
+		 *
+		 * Hash-based DISTINCT (grow-only frames) participates in the normal
+		 * forward main loop and does not need BACKWARD.
+		 *
+		 * NB: windistinct and distinctIsHash are set during
+		 * initialize_peragg() in ExecInitWindowAgg(), which runs before
+		 * any partition is started, so they are valid here.
 		 */
 		if (!(readptr_flags & EXEC_FLAG_BACKWARD))
 		{
 			for (int i = 0; i < winstate->numaggs; i++)
 			{
-				if (winstate->peragg[i].windistinct)
+				if (winstate->peragg[i].windistinct &&
+					!winstate->peragg[i].distinctIsHash)
 				{
 					readptr_flags |= EXEC_FLAG_BACKWARD;
 					break;
@@ -3277,7 +3527,8 @@ ExecReScanWindowAgg(WindowAggState *node)
  * initialize_peragg
  *
  * Almost same as in nodeAgg.c, except we only support DISTINCT for
- * whole-partition frames and single-argument aggregates.
+ * non-shrinking frames (UNBOUNDED PRECEDING, no EXCLUDE) and
+ * single-argument aggregates.
  */
 static WindowStatePerAggData *
 initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
@@ -3304,15 +3555,19 @@ initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
 	ListCell   *lc;
 
 	/*
-	 * Validate DISTINCT usage.  Currently we only support DISTINCT for
-	 * whole-partition frames (where the result is constant across the
-	 * partition) and single-argument aggregates only.
+	 * Validate DISTINCT usage.  We support DISTINCT for whole-partition frames
+	 * (sort-based deduplication) and grow-only frames (hash-based dedup).
+	 * A grow-only frame starts at UNBOUNDED PRECEDING with no EXCLUSION.
+	 * Only single-argument aggregates are supported.
 	 */
-	if (wfunc->windistinct && !is_whole_partition_frame(winstate))
+	if (wfunc->windistinct && !is_grow_only_frame(winstate))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DISTINCT is only supported for window functions with a frame that covers the entire partition"),
-				 errhint("Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.")));
+				 errmsg("DISTINCT is not supported for window functions "
+						"with frames that do not start at UNBOUNDED "
+						"PRECEDING or that use an EXCLUDE clause"),
+				 errhint("The frame must start at UNBOUNDED PRECEDING "
+						 "and must not have an EXCLUDE clause.")));
 
 	if (wfunc->windistinct && list_length(wfunc->args) != 1)
 		ereport(ERROR,
@@ -3590,6 +3845,50 @@ initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
 		get_typlenbyval(inputType,
 						&peraggstate->inputtypeLen,
 						&peraggstate->inputtypeByVal);
+
+		/*
+		 * For non-shrinking but non-whole-partition frames, set up hash-based
+		 * incremental DISTINCT.  Unlike the whole-partition path which uses
+		 * sort-based deduplication (requiring only btree operators), the
+		 * incremental path must track previously seen values across rows
+		 * using a hash table, so the argument type must support hashing.
+		 * If it does not, we reject here rather than silently falling back,
+		 * because a sort-based incremental approach would require re-sorting
+		 * on every row.
+		 *
+		 * The hash table itself is created per-partition in
+		 * initialize_windowaggregate().
+		 */
+		if (!is_whole_partition_frame(winstate))
+		{
+			Oid			hashfn_oid;
+
+			/* Grow-only path requires a hashable type */
+			if (!get_op_hash_functions(eqOpr, &hashfn_oid, NULL))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_FUNCTION),
+						 errmsg("could not find hash function for window "
+								"DISTINCT aggregate"),
+						 errhint("The argument type must support hashing.")));
+
+			peraggstate->distinctIsHash = true;
+
+			/* Create a single-column TupleDesc for the input type */
+			peraggstate->distinctTupleDesc = CreateTemplateTupleDesc(1);
+			TupleDescInitEntry(peraggstate->distinctTupleDesc, 1,
+							   "distinct_val", inputType, -1, 0);
+
+			/* Create a dedicated slot for hash table lookups */
+			peraggstate->distinctSlot =
+				MakeSingleTupleTableSlot(peraggstate->distinctTupleDesc,
+										 &TTSOpsVirtual);
+
+			/* Pre-compute hash infrastructure (reused across partitions) */
+			execTuplesHashPrepare(1,
+								  &eqOpr,
+								  &peraggstate->hashEqFuncOids,
+								  &peraggstate->hashFunctions);
+		}
 	}
 
 	ReleaseSysCache(aggTuple);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 84e2ab052e5..e89fc57c063 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -6118,30 +6118,193 @@ FROM generate_series(1, 10) g(x);
  10 |     2 |   3
 (10 rows)
 
--- Error: non-whole-partition frame (has ORDER BY -> RANGE UNBOUNDED PRECEDING to CURRENT ROW)
+-- Non-shrinking frame: ORDER BY produces RANGE UNBOUNDED PRECEDING to CURRENT ROW
 SELECT count(DISTINCT x) OVER (PARTITION BY x > 5 ORDER BY x)
-FROM generate_series(1, 10) g(x); -- error
-ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
-HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
--- Error: partial ROWS frame
-SELECT count(DISTINCT x) OVER (ROWS 3 PRECEDING)
-FROM generate_series(1, 10) g(x); -- error
-ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
-HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
--- Error: EXCLUDE clause
-SELECT count(DISTINCT x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW)
-FROM generate_series(1, 10) g(x); -- error
-ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
-HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
--- Error: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (not whole-partition
--- even without ORDER BY, because ROWS CURRENT ROW means exactly one row)
+FROM generate_series(1, 10) g(x);
+ count 
+-------
+     1
+     2
+     3
+     4
+     5
+     1
+     2
+     3
+     4
+     5
+(10 rows)
+
+-- Non-shrinking frame: running count with ORDER BY (default RANGE frame)
+SELECT x, count(DISTINCT x % 3) OVER (ORDER BY x)
+FROM generate_series(1, 10) g(x);
+ x  | count 
+----+-------
+  1 |     1
+  2 |     2
+  3 |     3
+  4 |     3
+  5 |     3
+  6 |     3
+  7 |     3
+  8 |     3
+  9 |     3
+ 10 |     3
+(10 rows)
+
+-- Non-shrinking frame: explicit ROWS UNBOUNDED PRECEDING to CURRENT ROW
+SELECT x, count(DISTINCT x % 3) OVER (
+    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
+FROM generate_series(1, 10) g(x);
+ x  | count 
+----+-------
+  1 |     1
+  2 |     2
+  3 |     3
+  4 |     3
+  5 |     3
+  6 |     3
+  7 |     3
+  8 |     3
+  9 |     3
+ 10 |     3
+(10 rows)
+
+-- Non-shrinking frame: PARTITION BY and ORDER BY
+SELECT x, sum(DISTINCT x % 3) OVER (PARTITION BY x > 5 ORDER BY x)
+FROM generate_series(1, 10) g(x);
+ x  | sum 
+----+-----
+  1 |   1
+  2 |   3
+  3 |   3
+  4 |   3
+  5 |   3
+  6 |   0
+  7 |   1
+  8 |   3
+  9 |   3
+ 10 |   3
+(10 rows)
+
+-- Non-shrinking frame: FILTER with ORDER BY
+SELECT x, count(DISTINCT x % 3) FILTER (WHERE x > 2) OVER (ORDER BY x)
+FROM generate_series(1, 10) g(x);
+ x  | count 
+----+-------
+  1 |     0
+  2 |     0
+  3 |     1
+  4 |     2
+  5 |     3
+  6 |     3
+  7 |     3
+  8 |     3
+  9 |     3
+ 10 |     3
+(10 rows)
+
+-- Non-shrinking frame: mixed DISTINCT + non-DISTINCT with ORDER BY
+SELECT x,
+       count(DISTINCT x % 3) OVER w,
+       sum(x) OVER w
+FROM generate_series(1, 9) g(x)
+WINDOW w AS (ORDER BY x);
+ x | count | sum 
+---+-------+-----
+ 1 |     1 |   1
+ 2 |     2 |   3
+ 3 |     3 |   6
+ 4 |     3 |  10
+ 5 |     3 |  15
+ 6 |     3 |  21
+ 7 |     3 |  28
+ 8 |     3 |  36
+ 9 |     3 |  45
+(9 rows)
+
+-- Non-shrinking frame: verify monotonically increasing running count
+SELECT x, count(DISTINCT x) OVER (ORDER BY x)
+FROM generate_series(1, 5) g(x);
+ x | count 
+---+-------
+ 1 |     1
+ 2 |     2
+ 3 |     3
+ 4 |     4
+ 5 |     5
+(5 rows)
+
+-- Non-shrinking frame: ROWS UNBOUNDED PRECEDING to CURRENT ROW without ORDER BY
 SELECT x,
        count(DISTINCT x % 3) OVER (
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        )
+FROM generate_series(1, 10) g(x);
+ x  | count 
+----+-------
+  1 |     1
+  2 |     2
+  3 |     3
+  4 |     3
+  5 |     3
+  6 |     3
+  7 |     3
+  8 |     3
+  9 |     3
+ 10 |     3
+(10 rows)
+
+-- Non-shrinking frame: FOLLOWING bound (not just CURRENT ROW)
+SELECT x,
+       count(DISTINCT x % 3) OVER (
+           ORDER BY x
+           ROWS BETWEEN UNBOUNDED PRECEDING AND 2 FOLLOWING
+       )
+FROM generate_series(1, 10) g(x);
+ x  | count 
+----+-------
+  1 |     3
+  2 |     3
+  3 |     3
+  4 |     3
+  5 |     3
+  6 |     3
+  7 |     3
+  8 |     3
+  9 |     3
+ 10 |     3
+(10 rows)
+
+-- Error: sliding frame (start is not UNBOUNDED PRECEDING)
+SELECT count(DISTINCT x) OVER (ROWS 3 PRECEDING)
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is not supported for window functions with frames that do not start at UNBOUNDED PRECEDING or that use an EXCLUDE clause
+HINT:  The frame must start at UNBOUNDED PRECEDING and must not have an EXCLUDE clause.
+-- Error: sliding frame with explicit bounds
+SELECT count(DISTINCT x) OVER (ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is not supported for window functions with frames that do not start at UNBOUNDED PRECEDING or that use an EXCLUDE clause
+HINT:  The frame must start at UNBOUNDED PRECEDING and must not have an EXCLUDE clause.
+-- Error: EXCLUDE clause with ORDER BY
+SELECT count(DISTINCT x) OVER (
+    ORDER BY x ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+    EXCLUDE CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is not supported for window functions with frames that do not start at UNBOUNDED PRECEDING or that use an EXCLUDE clause
+HINT:  The frame must start at UNBOUNDED PRECEDING and must not have an EXCLUDE clause.
+-- Error: EXCLUDE clause with UNBOUNDED frame
+SELECT count(DISTINCT x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW)
 FROM generate_series(1, 10) g(x); -- error
-ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
-HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
+ERROR:  DISTINCT is not supported for window functions with frames that do not start at UNBOUNDED PRECEDING or that use an EXCLUDE clause
+HINT:  The frame must start at UNBOUNDED PRECEDING and must not have an EXCLUDE clause.
+-- Error: non-hashable type with non-shrinking frame (money has btree but no hash).
+-- Whole-partition DISTINCT uses sort-based dedup so money works there,
+-- but the incremental non-shrinking path requires hash support.
+SELECT count(DISTINCT x::money) OVER (ORDER BY x)
+FROM generate_series(1, 5) g(x); -- error
+ERROR:  could not find hash function for window DISTINCT aggregate
+HINT:  The argument type must support hashing.
 -- Error: multi-argument DISTINCT window aggregate (not yet supported)
 SELECT string_agg(DISTINCT four::text, ',') OVER (PARTITION BY ten)
 FROM tenk1; -- error
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 8e319c45461..7ec37ba3729 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -2210,26 +2210,77 @@ SELECT x,
        sum(DISTINCT x % 3) OVER (PARTITION BY x > 5)
 FROM generate_series(1, 10) g(x);
 
--- Error: non-whole-partition frame (has ORDER BY -> RANGE UNBOUNDED PRECEDING to CURRENT ROW)
+-- Non-shrinking frame: ORDER BY produces RANGE UNBOUNDED PRECEDING to CURRENT ROW
 SELECT count(DISTINCT x) OVER (PARTITION BY x > 5 ORDER BY x)
-FROM generate_series(1, 10) g(x); -- error
+FROM generate_series(1, 10) g(x);
 
--- Error: partial ROWS frame
-SELECT count(DISTINCT x) OVER (ROWS 3 PRECEDING)
-FROM generate_series(1, 10) g(x); -- error
+-- Non-shrinking frame: running count with ORDER BY (default RANGE frame)
+SELECT x, count(DISTINCT x % 3) OVER (ORDER BY x)
+FROM generate_series(1, 10) g(x);
 
--- Error: EXCLUDE clause
-SELECT count(DISTINCT x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW)
-FROM generate_series(1, 10) g(x); -- error
+-- Non-shrinking frame: explicit ROWS UNBOUNDED PRECEDING to CURRENT ROW
+SELECT x, count(DISTINCT x % 3) OVER (
+    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
+FROM generate_series(1, 10) g(x);
+
+-- Non-shrinking frame: PARTITION BY and ORDER BY
+SELECT x, sum(DISTINCT x % 3) OVER (PARTITION BY x > 5 ORDER BY x)
+FROM generate_series(1, 10) g(x);
+
+-- Non-shrinking frame: FILTER with ORDER BY
+SELECT x, count(DISTINCT x % 3) FILTER (WHERE x > 2) OVER (ORDER BY x)
+FROM generate_series(1, 10) g(x);
+
+-- Non-shrinking frame: mixed DISTINCT + non-DISTINCT with ORDER BY
+SELECT x,
+       count(DISTINCT x % 3) OVER w,
+       sum(x) OVER w
+FROM generate_series(1, 9) g(x)
+WINDOW w AS (ORDER BY x);
 
--- Error: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (not whole-partition
--- even without ORDER BY, because ROWS CURRENT ROW means exactly one row)
+-- Non-shrinking frame: verify monotonically increasing running count
+SELECT x, count(DISTINCT x) OVER (ORDER BY x)
+FROM generate_series(1, 5) g(x);
+
+-- Non-shrinking frame: ROWS UNBOUNDED PRECEDING to CURRENT ROW without ORDER BY
 SELECT x,
        count(DISTINCT x % 3) OVER (
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        )
+FROM generate_series(1, 10) g(x);
+
+-- Non-shrinking frame: FOLLOWING bound (not just CURRENT ROW)
+SELECT x,
+       count(DISTINCT x % 3) OVER (
+           ORDER BY x
+           ROWS BETWEEN UNBOUNDED PRECEDING AND 2 FOLLOWING
+       )
+FROM generate_series(1, 10) g(x);
+
+-- Error: sliding frame (start is not UNBOUNDED PRECEDING)
+SELECT count(DISTINCT x) OVER (ROWS 3 PRECEDING)
 FROM generate_series(1, 10) g(x); -- error
 
+-- Error: sliding frame with explicit bounds
+SELECT count(DISTINCT x) OVER (ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: EXCLUDE clause with ORDER BY
+SELECT count(DISTINCT x) OVER (
+    ORDER BY x ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+    EXCLUDE CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: EXCLUDE clause with UNBOUNDED frame
+SELECT count(DISTINCT x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: non-hashable type with non-shrinking frame (money has btree but no hash).
+-- Whole-partition DISTINCT uses sort-based dedup so money works there,
+-- but the incremental non-shrinking path requires hash support.
+SELECT count(DISTINCT x::money) OVER (ORDER BY x)
+FROM generate_series(1, 5) g(x); -- error
+
 -- Error: multi-argument DISTINCT window aggregate (not yet supported)
 SELECT string_agg(DISTINCT four::text, ',') OVER (PARTITION BY ten)
 FROM tenk1; -- error
-- 
2.52.0



  [application/octet-stream] v2-0002-Support-DISTINCT-in-whole-partition-window-aggreg.patch (24.1K, 4-v2-0002-Support-DISTINCT-in-whole-partition-window-aggreg.patch)
  download | inline diff:
From 531b815c80e02de9e62cd2ec4d7d0dc10b293b9b Mon Sep 17 00:00:00 2001
From: Haibo Yan <[email protected]>
Date: Tue, 7 Apr 2026 11:57:50 -0700
Subject: [PATCH v2 02/10] Support DISTINCT in whole-partition window
 aggregates

Add executor support for DISTINCT in plain aggregate window functions
when the frame covers the entire partition.  Collect argument values,
sort, deduplicate, and run the transition function once per distinct
value; reuse the result for every row.  Only single-argument DISTINCT
is handled; other frame shapes are still rejected.
---
 src/backend/executor/nodeWindowAgg.c | 393 ++++++++++++++++++++++++++-
 src/test/regress/expected/window.out | 166 ++++++++++-
 src/test/regress/sql/window.sql      |  64 ++++-
 3 files changed, 609 insertions(+), 14 deletions(-)

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 4087ab1b16e..155745b8952 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -47,6 +47,7 @@
 #include "optimizer/optimizer.h"
 #include "parser/parse_agg.h"
 #include "parser/parse_coerce.h"
+#include "parser/parse_oper.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
@@ -55,6 +56,7 @@
 #include "utils/memutils.h"
 #include "utils/regproc.h"
 #include "utils/syscache.h"
+#include "utils/tuplesort.h"
 #include "utils/tuplestore.h"
 #include "windowapi.h"
 
@@ -171,6 +173,14 @@ typedef struct WindowStatePerAggData
 
 	/* Data local to eval_windowaggregates() */
 	bool		restart;		/* need to restart this agg in this cycle? */
+
+	/* DISTINCT support */
+	bool		windistinct;	/* DISTINCT specified on this aggregate */
+	Oid			inputtypeOid;	/* OID of the single DISTINCT argument type */
+	Oid			sortOperator;	/* btree < operator for sorting */
+	Oid			sortCollation;	/* collation for sort/equality */
+	bool		sortNullsFirst; /* NULLS FIRST? */
+	FmgrInfo	equalfn;		/* equality comparison function */
 } WindowStatePerAggData;
 
 static void initialize_windowaggregate(WindowAggState *winstate,
@@ -207,6 +217,11 @@ static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate,
 												WindowStatePerAgg peraggstate);
 static Datum GetAggInitVal(Datum textInitVal, Oid transtype);
 
+static bool is_whole_partition_frame(WindowAggState *winstate);
+static void eval_windowaggregate_distinct(WindowAggState *winstate,
+										  WindowStatePerFunc perfuncstate,
+										  WindowStatePerAgg peraggstate);
+
 static bool are_peers(WindowAggState *winstate, TupleTableSlot *slot1,
 					  TupleTableSlot *slot2);
 static bool window_gettupleslot(WindowObject winobj, int64 pos,
@@ -694,6 +709,297 @@ finalize_windowaggregate(WindowAggState *winstate,
 	MemoryContextSwitchTo(oldContext);
 }
 
+/*
+ * is_whole_partition_frame
+ *
+ * Returns true if the window frame is guaranteed to cover the entire
+ * partition.  This is the case when the start is UNBOUNDED PRECEDING,
+ * there is no EXCLUSION clause, and the end is either UNBOUNDED FOLLOWING
+ * or CURRENT ROW with no ORDER BY in RANGE or GROUPS mode (which means
+ * all rows are peers, so CURRENT ROW extends to the partition boundary).
+ * In ROWS mode, CURRENT ROW always means exactly the current row.
+ */
+static bool
+is_whole_partition_frame(WindowAggState *winstate)
+{
+	WindowAgg  *node = (WindowAgg *) winstate->ss.ps.plan;
+	int			frameOptions = winstate->frameOptions;
+
+	/* Must start at UNBOUNDED PRECEDING */
+	if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
+		return false;
+
+	/* Must not have an EXCLUSION clause */
+	if (frameOptions & FRAMEOPTION_EXCLUSION)
+		return false;
+
+	/* End must be UNBOUNDED FOLLOWING ... */
+	if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
+		return true;
+
+	/*
+	 * ... or CURRENT ROW with no ORDER BY (all rows are peers), but only
+	 * for RANGE or GROUPS mode.  In ROWS mode, CURRENT ROW means exactly
+	 * the current row regardless of peers.
+	 */
+	if ((frameOptions & FRAMEOPTION_END_CURRENT_ROW) &&
+		!(frameOptions & FRAMEOPTION_ROWS) &&
+		node->ordNumCols == 0)
+		return true;
+
+	return false;
+}
+
+/*
+ * eval_windowaggregate_distinct
+ *
+ * Compute a single-argument DISTINCT window aggregate over the whole
+ * partition.  We collect all argument values (applying any FILTER clause),
+ * sort them, skip duplicates, and feed the distinct values into the
+ * aggregate's transition function.
+ *
+ * This follows the pattern of process_ordered_aggregate_single() in
+ * nodeAgg.c.
+ */
+static void
+eval_windowaggregate_distinct(WindowAggState *winstate,
+							  WindowStatePerFunc perfuncstate,
+							  WindowStatePerAgg peraggstate)
+{
+	WindowObject agg_winobj = winstate->agg_winobj;
+	TupleTableSlot *temp_slot = winstate->temp_slot_1;
+	ExprContext *econtext = winstate->tmpcontext;
+	WindowFuncExprState *wfuncstate = perfuncstate->wfuncstate;
+	ExprState  *filter = wfuncstate->aggfilter;
+	int			numArguments = perfuncstate->numArguments;
+	LOCAL_FCINFO(fcinfo, FUNC_MAX_ARGS);
+	Tuplesortstate *sortstate;
+	Datum		newVal;
+	bool		newIsNull;
+	Datum		newAbbrevVal;
+	Datum		oldVal = (Datum) 0;
+	bool		oldIsNull = true;
+	bool		haveOldVal = false;
+	Datum		oldAbbrevVal = (Datum) 0;
+	MemoryContext oldContext;
+	int64		total_rows;
+	int64		row;
+
+	/* Ensure all partition rows are spooled */
+	spool_tuples(winstate, -1);
+	total_rows = winstate->spooled_rows;
+
+	/* Create a tuplesort for the single DISTINCT argument */
+	sortstate = tuplesort_begin_datum(peraggstate->inputtypeOid,
+									  peraggstate->sortOperator,
+									  peraggstate->sortCollation,
+									  peraggstate->sortNullsFirst,
+									  work_mem, NULL, TUPLESORT_NONE);
+
+	/*
+	 * Loop over all rows in the partition, evaluate FILTER and the argument,
+	 * and feed values into the sort.
+	 */
+	for (row = 0; row < total_rows; row++)
+	{
+		if (!window_gettupleslot(agg_winobj, row, temp_slot))
+			break;
+
+		econtext->ecxt_outertuple = temp_slot;
+
+		oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+		/* Skip anything FILTERed out */
+		if (filter)
+		{
+			bool		isnull;
+			Datum		res = ExecEvalExpr(filter, econtext, &isnull);
+
+			if (isnull || !DatumGetBool(res))
+			{
+				MemoryContextSwitchTo(oldContext);
+				ResetExprContext(econtext);
+				ExecClearTuple(temp_slot);
+				continue;
+			}
+		}
+
+		/* Evaluate the single argument */
+		{
+			ExprState  *argstate = (ExprState *) linitial(wfuncstate->args);
+			Datum		val;
+			bool		isnull;
+
+			val = ExecEvalExpr(argstate, econtext, &isnull);
+
+			MemoryContextSwitchTo(oldContext);
+
+			/* Feed into sort */
+			tuplesort_putdatum(sortstate, val, isnull);
+		}
+
+		ResetExprContext(econtext);
+		ExecClearTuple(temp_slot);
+	}
+
+	/* Sort */
+	tuplesort_performsort(sortstate);
+
+	/*
+	 * Read back sorted values, skip duplicates, and feed distinct values
+	 * into the transition function.  This mirrors
+	 * process_ordered_aggregate_single() in nodeAgg.c.
+	 */
+	while (tuplesort_getdatum(sortstate, true, false,
+							  &newVal, &newIsNull, &newAbbrevVal))
+	{
+		ResetExprContext(econtext);
+		oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+		/*
+		 * If DISTINCT mode, skip if not distinct from prior value.
+		 */
+		if (haveOldVal &&
+			((oldIsNull && newIsNull) ||
+			 (!oldIsNull && !newIsNull &&
+			  oldAbbrevVal == newAbbrevVal &&
+			  DatumGetBool(FunctionCall2Coll(&peraggstate->equalfn,
+											 peraggstate->sortCollation,
+											 oldVal, newVal)))))
+		{
+			MemoryContextSwitchTo(oldContext);
+			continue;
+		}
+
+		/*
+		 * Advance the transition function with this distinct value.
+		 * This replicates the strict-function handling from
+		 * advance_windowaggregate().
+		 */
+		if (peraggstate->transfn.fn_strict)
+		{
+			/* For strict transfn, skip NULL inputs */
+			if (newIsNull)
+			{
+				MemoryContextSwitchTo(oldContext);
+				goto remember_value;
+			}
+
+			/*
+			 * For strict transition functions with initial value NULL,
+			 * use the first non-NULL input as the initial state.
+			 */
+			if (peraggstate->transValueCount == 0 &&
+				peraggstate->transValueIsNull)
+			{
+				MemoryContextSwitchTo(peraggstate->aggcontext);
+				peraggstate->transValue = datumCopy(newVal,
+													peraggstate->transtypeByVal,
+													peraggstate->transtypeLen);
+				peraggstate->transValueIsNull = false;
+				peraggstate->transValueCount = 1;
+				MemoryContextSwitchTo(oldContext);
+				goto remember_value;
+			}
+
+			if (peraggstate->transValueIsNull)
+			{
+				/*
+				 * Don't call a strict function with NULL inputs.
+				 */
+				MemoryContextSwitchTo(oldContext);
+				goto remember_value;
+			}
+		}
+
+		/* OK to call the transition function */
+		InitFunctionCallInfoData(*fcinfo, &(peraggstate->transfn),
+								 numArguments + 1,
+								 perfuncstate->winCollation,
+								 (Node *) winstate, NULL);
+		fcinfo->args[0].value = peraggstate->transValue;
+		fcinfo->args[0].isnull = peraggstate->transValueIsNull;
+		fcinfo->args[1].value = newVal;
+		fcinfo->args[1].isnull = newIsNull;
+		winstate->curaggcontext = peraggstate->aggcontext;
+
+		{
+			Datum		result;
+
+			result = FunctionCallInvoke(fcinfo);
+			winstate->curaggcontext = NULL;
+
+			peraggstate->transValueCount++;
+
+			/*
+			 * If pass-by-ref datatype, must copy the new value into
+			 * aggcontext and free the prior transValue.  But if transfn
+			 * returned a pointer to its first input, we don't need to do
+			 * anything.  Also, if transfn returned a pointer to a R/W
+			 * expanded object that is already a child of the aggcontext,
+			 * assume we can adopt that value without copying it.
+			 *
+			 * This must match advance_windowaggregate's logic exactly.
+			 */
+			if (!peraggstate->transtypeByVal &&
+				DatumGetPointer(result) != DatumGetPointer(peraggstate->transValue))
+			{
+				if (!fcinfo->isnull)
+				{
+					MemoryContextSwitchTo(peraggstate->aggcontext);
+					if (DatumIsReadWriteExpandedObject(result,
+													   false,
+													   peraggstate->transtypeLen) &&
+						MemoryContextGetParent(DatumGetEOHP(result)->eoh_context) == CurrentMemoryContext)
+						 /* do nothing */ ;
+					else
+						result = datumCopy(result,
+										   peraggstate->transtypeByVal,
+										   peraggstate->transtypeLen);
+				}
+				if (!peraggstate->transValueIsNull)
+				{
+					if (DatumIsReadWriteExpandedObject(peraggstate->transValue,
+													   false,
+													   peraggstate->transtypeLen))
+						DeleteExpandedObject(peraggstate->transValue);
+					else
+						pfree(DatumGetPointer(peraggstate->transValue));
+				}
+			}
+
+			MemoryContextSwitchTo(oldContext);
+			peraggstate->transValue = result;
+			peraggstate->transValueIsNull = fcinfo->isnull;
+		}
+
+remember_value:
+		/*
+		 * Remember the current value for subsequent duplicate checks.
+		 */
+		if (!peraggstate->inputtypeByVal)
+		{
+			if (!oldIsNull)
+				pfree(DatumGetPointer(oldVal));
+			if (!newIsNull)
+				oldVal = datumCopy(newVal, peraggstate->inputtypeByVal,
+								   peraggstate->inputtypeLen);
+			else
+				oldVal = (Datum) 0;
+		}
+		else
+			oldVal = newVal;
+		oldAbbrevVal = newAbbrevVal;
+		oldIsNull = newIsNull;
+		haveOldVal = true;
+	}
+
+	if (!oldIsNull && !peraggstate->inputtypeByVal)
+		pfree(DatumGetPointer(oldVal));
+
+	tuplesort_end(sortstate);
+}
+
 /*
  * eval_windowaggregates
  * evaluate plain aggregates being used as window functions
@@ -947,6 +1253,22 @@ eval_windowaggregates(WindowAggState *winstate)
 		}
 	}
 
+	/*
+	 * Compute DISTINCT aggregates for the whole partition.  These are handled
+	 * separately via sort-based deduplication rather than the main
+	 * accumulation loop below.
+	 */
+	for (i = 0; i < numaggs; i++)
+	{
+		peraggstate = &winstate->peragg[i];
+		if (!peraggstate->windistinct || !peraggstate->restart)
+			continue;
+		wfuncno = peraggstate->wfuncno;
+		eval_windowaggregate_distinct(winstate,
+									  &winstate->perfunc[wfuncno],
+									  peraggstate);
+	}
+
 	/*
 	 * Non-restarted aggregates now contain the rows between aggregatedbase
 	 * (i.e., frameheadpos) and aggregatedupto, while restarted aggregates
@@ -1003,6 +1325,10 @@ eval_windowaggregates(WindowAggState *winstate)
 		{
 			peraggstate = &winstate->peragg[i];
 
+			/* DISTINCT aggregates are handled separately */
+			if (peraggstate->windistinct)
+				continue;
+
 			/* Non-restarted aggs skip until aggregatedupto_nonrestarted */
 			if (!peraggstate->restart &&
 				winstate->aggregatedupto < aggregatedupto_nonrestarted)
@@ -1170,6 +1496,26 @@ prepare_tuplestore(WindowAggState *winstate)
 			readptr_flags |= EXEC_FLAG_BACKWARD;
 		}
 
+		/*
+		 * If any aggregate uses DISTINCT, the read pointer also needs
+		 * BACKWARD capability.  The DISTINCT helper reads through the
+		 * entire partition to collect values for sorting, which advances
+		 * the read pointer to the end.  The main accumulation loop (for
+		 * non-DISTINCT aggregates in the same WindowAgg node) then needs
+		 * to rewind back to the frame head.
+		 */
+		if (!(readptr_flags & EXEC_FLAG_BACKWARD))
+		{
+			for (int i = 0; i < winstate->numaggs; i++)
+			{
+				if (winstate->peragg[i].windistinct)
+				{
+					readptr_flags |= EXEC_FLAG_BACKWARD;
+					break;
+				}
+			}
+		}
+
 		agg_winobj->readptr = tuplestore_alloc_read_pointer(winstate->buffer,
 															readptr_flags);
 	}
@@ -2930,7 +3276,8 @@ ExecReScanWindowAgg(WindowAggState *node)
 /*
  * initialize_peragg
  *
- * Almost same as in nodeAgg.c, except we don't support DISTINCT currently.
+ * Almost same as in nodeAgg.c, except we only support DISTINCT for
+ * whole-partition frames and single-argument aggregates.
  */
 static WindowStatePerAggData *
 initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
@@ -2957,13 +3304,20 @@ initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
 	ListCell   *lc;
 
 	/*
-	 * Temporary: reject DISTINCT window aggregates until executor support
-	 * lands.  Patch 2 will replace this with actual DISTINCT handling.
+	 * Validate DISTINCT usage.  Currently we only support DISTINCT for
+	 * whole-partition frames (where the result is constant across the
+	 * partition) and single-argument aggregates only.
 	 */
-	if (wfunc->windistinct)
+	if (wfunc->windistinct && !is_whole_partition_frame(winstate))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DISTINCT is not yet implemented for window aggregates")));
+				 errmsg("DISTINCT is only supported for window functions with a frame that covers the entire partition"),
+				 errhint("Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.")));
+
+	if (wfunc->windistinct && list_length(wfunc->args) != 1)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("DISTINCT is not supported for window aggregate functions with more than one argument")));
 
 	numArguments = list_length(wfunc->args);
 
@@ -3209,6 +3563,35 @@ initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
 	else
 		peraggstate->aggcontext = winstate->aggcontext;
 
+	/*
+	 * Set up DISTINCT state if needed.  We need sort and equality operators
+	 * for the single argument type, plus its type length and by-value info
+	 * for datum copying during the dedup loop.
+	 */
+	if (wfunc->windistinct)
+	{
+		Oid			ltOpr,
+					eqOpr;
+		Oid			inputType = inputTypes[0];
+
+		peraggstate->windistinct = true;
+		peraggstate->inputtypeOid = inputType;
+
+		get_sort_group_operators(inputType,
+								 true, true, false,
+								 &ltOpr, &eqOpr, NULL,
+								 NULL);
+
+		peraggstate->sortOperator = ltOpr;
+		peraggstate->sortCollation = wfunc->inputcollid;
+		peraggstate->sortNullsFirst = false;
+		fmgr_info(get_opcode(eqOpr), &peraggstate->equalfn);
+
+		get_typlenbyval(inputType,
+						&peraggstate->inputtypeLen,
+						&peraggstate->inputtypeByVal);
+	}
+
 	ReleaseSysCache(aggTuple);
 
 	return peraggstate;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 7e9212c4677..84e2ab052e5 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5968,8 +5968,7 @@ WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);
 DROP TABLE planets CASCADE;
 NOTICE:  drop cascades to view planets_view
 --
--- Test DISTINCT in window aggregates (parse/deparse plumbing only;
--- execution support is not yet implemented)
+-- Test DISTINCT in window aggregates
 --
 -- Should parse successfully and round-trip through a view definition
 CREATE TEMP VIEW window_distinct_view AS
@@ -5987,6 +5986,163 @@ SELECT ntile(DISTINCT 4) OVER () FROM tenk1; -- error
 ERROR:  DISTINCT is not implemented for non-aggregate window functions
 LINE 1: SELECT ntile(DISTINCT 4) OVER () FROM tenk1;
                ^
--- Execution fails with a clear executor-side error
-SELECT count(DISTINCT four) OVER (PARTITION BY ten) FROM tenk1; -- error
-ERROR:  DISTINCT is not yet implemented for window aggregates
+-- Basic DISTINCT whole-partition cases (should succeed)
+SELECT count(DISTINCT four) OVER (PARTITION BY ten)
+FROM tenk1 LIMIT 20;
+ count 
+-------
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+     2
+(20 rows)
+
+-- DISTINCT with no PARTITION BY (whole single partition)
+SELECT x, sum(DISTINCT x % 3) OVER ()
+FROM generate_series(1, 9) g(x);
+ x | sum 
+---+-----
+ 1 |   3
+ 2 |   3
+ 3 |   3
+ 4 |   3
+ 5 |   3
+ 6 |   3
+ 7 |   3
+ 8 |   3
+ 9 |   3
+(9 rows)
+
+-- DISTINCT with explicit UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING
+SELECT x, avg(DISTINCT x % 4) OVER (PARTITION BY x > 5
+  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
+FROM generate_series(1, 10) g(x);
+ x  |        avg         
+----+--------------------
+  1 | 1.5000000000000000
+  2 | 1.5000000000000000
+  3 | 1.5000000000000000
+  4 | 1.5000000000000000
+  5 | 1.5000000000000000
+  6 | 1.5000000000000000
+  7 | 1.5000000000000000
+  8 | 1.5000000000000000
+  9 | 1.5000000000000000
+ 10 | 1.5000000000000000
+(10 rows)
+
+-- DISTINCT with FILTER
+SELECT x,
+       count(DISTINCT x % 3) FILTER (WHERE x > 3) OVER (PARTITION BY x > 5)
+FROM generate_series(1, 10) g(x);
+ x  | count 
+----+-------
+  1 |     2
+  2 |     2
+  3 |     2
+  4 |     2
+  5 |     2
+  6 |     3
+  7 |     3
+  8 |     3
+  9 |     3
+ 10 |     3
+(10 rows)
+
+-- NULL handling
+SELECT x,
+       count(DISTINCT x) OVER ()
+FROM (VALUES (1),(2),(NULL),(2),(NULL),(1),(3)) v(x);
+ x | count 
+---+-------
+ 1 |     3
+ 2 |     3
+   |     3
+ 2 |     3
+   |     3
+ 1 |     3
+ 3 |     3
+(7 rows)
+
+-- Mixed DISTINCT and non-DISTINCT aggregates in same window
+SELECT x,
+       count(DISTINCT x % 3) OVER w,
+       sum(x) OVER w
+FROM generate_series(1, 9) g(x)
+WINDOW w AS (PARTITION BY x > 5);
+ x | count | sum 
+---+-------+-----
+ 1 |     3 |  15
+ 2 |     3 |  15
+ 3 |     3 |  15
+ 4 |     3 |  15
+ 5 |     3 |  15
+ 6 |     3 |  30
+ 7 |     3 |  30
+ 8 |     3 |  30
+ 9 |     3 |  30
+(9 rows)
+
+-- Multiple DISTINCT aggregates in same query
+SELECT x,
+       count(DISTINCT x % 2) OVER (PARTITION BY x > 5),
+       sum(DISTINCT x % 3) OVER (PARTITION BY x > 5)
+FROM generate_series(1, 10) g(x);
+ x  | count | sum 
+----+-------+-----
+  1 |     2 |   3
+  2 |     2 |   3
+  3 |     2 |   3
+  4 |     2 |   3
+  5 |     2 |   3
+  6 |     2 |   3
+  7 |     2 |   3
+  8 |     2 |   3
+  9 |     2 |   3
+ 10 |     2 |   3
+(10 rows)
+
+-- Error: non-whole-partition frame (has ORDER BY -> RANGE UNBOUNDED PRECEDING to CURRENT ROW)
+SELECT count(DISTINCT x) OVER (PARTITION BY x > 5 ORDER BY x)
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
+HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
+-- Error: partial ROWS frame
+SELECT count(DISTINCT x) OVER (ROWS 3 PRECEDING)
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
+HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
+-- Error: EXCLUDE clause
+SELECT count(DISTINCT x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
+HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
+-- Error: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (not whole-partition
+-- even without ORDER BY, because ROWS CURRENT ROW means exactly one row)
+SELECT x,
+       count(DISTINCT x % 3) OVER (
+           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+       )
+FROM generate_series(1, 10) g(x); -- error
+ERROR:  DISTINCT is only supported for window functions with a frame that covers the entire partition
+HINT:  Remove ORDER BY from the window definition, or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
+-- Error: multi-argument DISTINCT window aggregate (not yet supported)
+SELECT string_agg(DISTINCT four::text, ',') OVER (PARTITION BY ten)
+FROM tenk1; -- error
+ERROR:  DISTINCT is not supported for window aggregate functions with more than one argument
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 2c211e67052..8e319c45461 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -2161,8 +2161,7 @@ WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);
 DROP TABLE planets CASCADE;
 
 --
--- Test DISTINCT in window aggregates (parse/deparse plumbing only;
--- execution support is not yet implemented)
+-- Test DISTINCT in window aggregates
 --
 
 -- Should parse successfully and round-trip through a view definition
@@ -2175,5 +2174,62 @@ DROP VIEW window_distinct_view;
 -- DISTINCT on a non-aggregate window function is still a parse error
 SELECT ntile(DISTINCT 4) OVER () FROM tenk1; -- error
 
--- Execution fails with a clear executor-side error
-SELECT count(DISTINCT four) OVER (PARTITION BY ten) FROM tenk1; -- error
+-- Basic DISTINCT whole-partition cases (should succeed)
+SELECT count(DISTINCT four) OVER (PARTITION BY ten)
+FROM tenk1 LIMIT 20;
+
+-- DISTINCT with no PARTITION BY (whole single partition)
+SELECT x, sum(DISTINCT x % 3) OVER ()
+FROM generate_series(1, 9) g(x);
+
+-- DISTINCT with explicit UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING
+SELECT x, avg(DISTINCT x % 4) OVER (PARTITION BY x > 5
+  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
+FROM generate_series(1, 10) g(x);
+
+-- DISTINCT with FILTER
+SELECT x,
+       count(DISTINCT x % 3) FILTER (WHERE x > 3) OVER (PARTITION BY x > 5)
+FROM generate_series(1, 10) g(x);
+
+-- NULL handling
+SELECT x,
+       count(DISTINCT x) OVER ()
+FROM (VALUES (1),(2),(NULL),(2),(NULL),(1),(3)) v(x);
+
+-- Mixed DISTINCT and non-DISTINCT aggregates in same window
+SELECT x,
+       count(DISTINCT x % 3) OVER w,
+       sum(x) OVER w
+FROM generate_series(1, 9) g(x)
+WINDOW w AS (PARTITION BY x > 5);
+
+-- Multiple DISTINCT aggregates in same query
+SELECT x,
+       count(DISTINCT x % 2) OVER (PARTITION BY x > 5),
+       sum(DISTINCT x % 3) OVER (PARTITION BY x > 5)
+FROM generate_series(1, 10) g(x);
+
+-- Error: non-whole-partition frame (has ORDER BY -> RANGE UNBOUNDED PRECEDING to CURRENT ROW)
+SELECT count(DISTINCT x) OVER (PARTITION BY x > 5 ORDER BY x)
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: partial ROWS frame
+SELECT count(DISTINCT x) OVER (ROWS 3 PRECEDING)
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: EXCLUDE clause
+SELECT count(DISTINCT x) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW)
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (not whole-partition
+-- even without ORDER BY, because ROWS CURRENT ROW means exactly one row)
+SELECT x,
+       count(DISTINCT x % 3) OVER (
+           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+       )
+FROM generate_series(1, 10) g(x); -- error
+
+-- Error: multi-argument DISTINCT window aggregate (not yet supported)
+SELECT string_agg(DISTINCT four::text, ',') OVER (PARTITION BY ten)
+FROM tenk1; -- error
-- 
2.52.0



  [application/octet-stream] v2-0001-Parse-and-deparse-DISTINCT-in-window-aggregate-ca.patch (6.6K, 5-v2-0001-Parse-and-deparse-DISTINCT-in-window-aggregate-ca.patch)
  download | inline diff:
From 346bdfac78a69a67c790c61b20add74807f3418d Mon Sep 17 00:00:00 2001
From: Haibo Yan <[email protected]>
Date: Mon, 6 Apr 2026 23:50:50 -0700
Subject: [PATCH v2 01/10] Parse and deparse DISTINCT in window aggregate calls

Add a windistinct field to WindowFunc to carry the DISTINCT qualifier
through parse analysis, node serialization, and ruleutils deparse.
Move the rejection from parse time to executor init so that stored
rules and view definitions preserve the syntax.  Execution still
raises FEATURE_NOT_SUPPORTED until later patches add real support.
---
 src/backend/executor/nodeWindowAgg.c |  9 +++++++++
 src/backend/optimizer/util/clauses.c |  1 +
 src/backend/parser/parse_func.c      |  6 ++++--
 src/backend/utils/adt/ruleutils.c    |  2 ++
 src/include/nodes/primnodes.h        |  2 ++
 src/test/regress/expected/window.out | 23 +++++++++++++++++++++++
 src/test/regress/sql/window.sql      | 18 ++++++++++++++++++
 7 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index f52a7aae843..4087ab1b16e 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2956,6 +2956,15 @@ initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
 	int			i;
 	ListCell   *lc;
 
+	/*
+	 * Temporary: reject DISTINCT window aggregates until executor support
+	 * lands.  Patch 2 will replace this with actual DISTINCT handling.
+	 */
+	if (wfunc->windistinct)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("DISTINCT is not yet implemented for window aggregates")));
+
 	numArguments = list_length(wfunc->args);
 
 	i = 0;
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index fcf6d7fff2a..62d62d7fdef 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2816,6 +2816,7 @@ eval_const_expressions_mutator(Node *node,
 				newexpr->winref = expr->winref;
 				newexpr->winstar = expr->winstar;
 				newexpr->winagg = expr->winagg;
+				newexpr->windistinct = expr->windistinct;
 				newexpr->ignore_nulls = expr->ignore_nulls;
 				newexpr->location = expr->location;
 
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 35ff6427147..d216d53e530 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -847,6 +847,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
 		/* winref will be set by transformWindowFuncCall */
 		wfunc->winstar = agg_star;
 		wfunc->winagg = (fdresult == FUNCDETAIL_AGGREGATE);
+		wfunc->windistinct = agg_distinct;
 		wfunc->aggfilter = agg_filter;
 		wfunc->ignore_nulls = ignore_nulls;
 		wfunc->runCondition = NIL;
@@ -854,11 +855,12 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
 
 		/*
 		 * agg_star is allowed for aggregate functions but distinct isn't
+		 * allowed for non-aggregate window functions.
 		 */
-		if (agg_distinct)
+		if (agg_distinct && !wfunc->winagg)
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("DISTINCT is not implemented for window functions"),
+					 errmsg("DISTINCT is not implemented for non-aggregate window functions"),
 					 parser_errposition(pstate, location)));
 
 		/*
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index c781cdc84d3..73dc9f85b40 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -11659,6 +11659,8 @@ get_windowfunc_expr_helper(WindowFunc *wfunc, deparse_context *context,
 		appendStringInfoChar(buf, '*');
 	else
 	{
+		if (wfunc->windistinct)
+			appendStringInfoString(buf, "DISTINCT ");
 		if (is_json_objectagg)
 		{
 			get_rule_expr((Node *) linitial(wfunc->args), context, false);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 6dfc946c20b..1e984dfbfda 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -614,6 +614,8 @@ typedef struct WindowFunc
 	bool		winstar pg_node_attr(query_jumble_ignore);
 	/* is function a simple aggregate? */
 	bool		winagg pg_node_attr(query_jumble_ignore);
+	/* true if aggregate arguments were marked DISTINCT */
+	bool		windistinct;
 	/* ignore nulls. One of the Null Treatment options */
 	int			ignore_nulls;
 	/* token location, or -1 if unknown */
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index e6aac27a2a9..7e9212c4677 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5967,3 +5967,26 @@ WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);
 --cleanup
 DROP TABLE planets CASCADE;
 NOTICE:  drop cascades to view planets_view
+--
+-- Test DISTINCT in window aggregates (parse/deparse plumbing only;
+-- execution support is not yet implemented)
+--
+-- Should parse successfully and round-trip through a view definition
+CREATE TEMP VIEW window_distinct_view AS
+SELECT count(DISTINCT four) OVER (PARTITION BY ten) AS cnt
+FROM tenk1;
+SELECT pg_get_viewdef('window_distinct_view') LIKE '%DISTINCT%' AS has_distinct;
+ has_distinct 
+--------------
+ t
+(1 row)
+
+DROP VIEW window_distinct_view;
+-- DISTINCT on a non-aggregate window function is still a parse error
+SELECT ntile(DISTINCT 4) OVER () FROM tenk1; -- error
+ERROR:  DISTINCT is not implemented for non-aggregate window functions
+LINE 1: SELECT ntile(DISTINCT 4) OVER () FROM tenk1;
+               ^
+-- Execution fails with a clear executor-side error
+SELECT count(DISTINCT four) OVER (PARTITION BY ten) FROM tenk1; -- error
+ERROR:  DISTINCT is not yet implemented for window aggregates
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 305549b104d..2c211e67052 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -2159,3 +2159,21 @@ WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);
 
 --cleanup
 DROP TABLE planets CASCADE;
+
+--
+-- Test DISTINCT in window aggregates (parse/deparse plumbing only;
+-- execution support is not yet implemented)
+--
+
+-- Should parse successfully and round-trip through a view definition
+CREATE TEMP VIEW window_distinct_view AS
+SELECT count(DISTINCT four) OVER (PARTITION BY ten) AS cnt
+FROM tenk1;
+SELECT pg_get_viewdef('window_distinct_view') LIKE '%DISTINCT%' AS has_distinct;
+DROP VIEW window_distinct_view;
+
+-- DISTINCT on a non-aggregate window function is still a parse error
+SELECT ntile(DISTINCT 4) OVER () FROM tenk1; -- error
+
+-- Execution fails with a clear executor-side error
+SELECT count(DISTINCT four) OVER (PARTITION BY ten) FROM tenk1; -- error
-- 
2.52.0



view thread (2+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected]
  Subject: Re: [PATCH] DISTINCT in plain aggregate window functions
  In-Reply-To: <CABXr29EW_+XRD4dTcEN4+=K78OvetMWMqF0qgSv_JwG6D4Gu3g@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox