public inbox for [email protected]
help / color / mirror / Atom feedRe: nth_value out of more than n values returns null
6+ messages / 4 participants
[nested] [flat]
* Re: nth_value out of more than n values returns null
@ 2024-11-04 23:36 Erik Wienhold <[email protected]>
2024-11-04 23:38 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Erik Wienhold @ 2024-11-04 23:36 UTC (permalink / raw)
To: Guyren Howe <[email protected]>; +Cc: pgsql-general
On 2024-11-05 00:17 +0100, Guyren Howe wrote:
> This query:
>
> SELECT NTH_VALUE(id, 5000000) OVER (ORDER BY created_at, id ASC) FROM table
>
> in a table where SELECT COUNT(*) returns a value a few thousand over 5
> million, where id is the primary key, returns null.
>
> The inclusion of the primary key should make the order by a total
> order. So there should be a 5 millionth row.
>
> How can this happen?
https://www.postgresql.org/docs/current/functions-window.html explains
it:
Note that first_value, last_value, and nth_value consider only the rows
within the "window frame", which by default contains the rows from the
start of the partition through the last peer of the current row. This is
likely to give unhelpful results for last_value and sometimes also
nth_value. You can redefine the frame by adding a suitable frame
specification (RANGE, ROWS or GROUPS) to the OVER clause. See Section
4.2.8 for more information about frame specifications.
You probably want to extend the window frame with this:
SELECT NTH_VALUE(id, 5000000) OVER (
ORDER BY created_at, id ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) FROM table
--
Erik
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: nth_value out of more than n values returns null
2024-11-04 23:36 Re: nth_value out of more than n values returns null Erik Wienhold <[email protected]>
@ 2024-11-04 23:38 ` Guyren Howe <[email protected]>
2024-11-04 23:44 ` Re: nth_value out of more than n values returns null David G. Johnston <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Guyren Howe @ 2024-11-04 23:38 UTC (permalink / raw)
To: Erik Wienhold <[email protected]>; +Cc: pgsql-general
I’m trying to get the id of the 5,000,000th record, so I can join against it to get a name. I didn’t fully understand what the docs say there. What am I missing?
On 4 Nov 2024 at 15:36 -0800, Erik Wienhold <[email protected]>, wrote:
> On 2024-11-05 00:17 +0100, Guyren Howe wrote:
> > This query:
> >
> > SELECT NTH_VALUE(id, 5000000) OVER (ORDER BY created_at, id ASC) FROM table
> >
> > in a table where SELECT COUNT(*) returns a value a few thousand over 5
> > million, where id is the primary key, returns null.
> >
> > The inclusion of the primary key should make the order by a total
> > order. So there should be a 5 millionth row.
> >
> > How can this happen?
>
> https://www.postgresql.org/docs/current/functions-window.html explains
> it:
>
> Note that first_value, last_value, and nth_value consider only the rows
> within the "window frame", which by default contains the rows from the
> start of the partition through the last peer of the current row. This is
> likely to give unhelpful results for last_value and sometimes also
> nth_value. You can redefine the frame by adding a suitable frame
> specification (RANGE, ROWS or GROUPS) to the OVER clause. See Section
> 4.2.8 for more information about frame specifications.
>
> You probably want to extend the window frame with this:
>
> SELECT NTH_VALUE(id, 5000000) OVER (
> ORDER BY created_at, id ASC
> ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> ) FROM table
>
> --
> Erik
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: nth_value out of more than n values returns null
2024-11-04 23:36 Re: nth_value out of more than n values returns null Erik Wienhold <[email protected]>
2024-11-04 23:38 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
@ 2024-11-04 23:44 ` David G. Johnston <[email protected]>
2024-11-04 23:46 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
2024-11-05 00:49 ` Re: nth_value out of more than n values returns null Tom Lane <[email protected]>
0 siblings, 2 replies; 6+ messages in thread
From: David G. Johnston @ 2024-11-04 23:44 UTC (permalink / raw)
To: Guyren Howe <[email protected]>; +Cc: Erik Wienhold <[email protected]>; pgsql-general
On Monday, November 4, 2024, Guyren Howe <[email protected]> wrote:
> I’m trying to get the id of the 5,000,000th record, so I can join against
> it to get a name. I didn’t fully understand what the docs say there. What
> am I missing?
>
So just use “offset 5_000_000 limit 1”. Bringing in a window function here
seems unhelpful.
David J.
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: nth_value out of more than n values returns null
2024-11-04 23:36 Re: nth_value out of more than n values returns null Erik Wienhold <[email protected]>
2024-11-04 23:38 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
2024-11-04 23:44 ` Re: nth_value out of more than n values returns null David G. Johnston <[email protected]>
@ 2024-11-04 23:46 ` Guyren Howe <[email protected]>
2024-11-04 23:53 ` Re: nth_value out of more than n values returns null David G. Johnston <[email protected]>
1 sibling, 1 reply; 6+ messages in thread
From: Guyren Howe @ 2024-11-04 23:46 UTC (permalink / raw)
To: David G. Johnston <[email protected]>; +Cc: Erik Wienhold <[email protected]>; pgsql-general
Wouldn’t it be offset 4_999_999?
I’d still like to understand why nth_value doesn’t work.
On 4 Nov 2024 at 15:44 -0800, David G. Johnston <[email protected]>, wrote:
> On Monday, November 4, 2024, Guyren Howe <[email protected]> wrote:
> > I’m trying to get the id of the 5,000,000th record, so I can join against it to get a name. I didn’t fully understand what the docs say there. What am I missing?
>
> So just use “offset 5_000_000 limit 1”. Bringing in a window function here seems unhelpful.
>
> David J.
>
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: nth_value out of more than n values returns null
2024-11-04 23:36 Re: nth_value out of more than n values returns null Erik Wienhold <[email protected]>
2024-11-04 23:38 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
2024-11-04 23:44 ` Re: nth_value out of more than n values returns null David G. Johnston <[email protected]>
2024-11-04 23:46 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
@ 2024-11-04 23:53 ` David G. Johnston <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: David G. Johnston @ 2024-11-04 23:53 UTC (permalink / raw)
To: Guyren Howe <[email protected]>; +Cc: Erik Wienhold <[email protected]>; pgsql-general
On Monday, November 4, 2024, Guyren Howe <[email protected]> wrote:
> Wouldn’t it be offset 4_999_999?
>
Probably. I tend to expect off-by-one for these kinds of things and test
my way out.
>
> I’d still like to understand why nth_value doesn’t work.
>
When you perform an order by in a window clause the frame you get by
default ends at the current row. Consider “count(*) over ()” versus
“count(*) over (order by id)”.
You need to not use defaults for the window frame if this doesn’t suit you.
David J.
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: nth_value out of more than n values returns null
2024-11-04 23:36 Re: nth_value out of more than n values returns null Erik Wienhold <[email protected]>
2024-11-04 23:38 ` Re: nth_value out of more than n values returns null Guyren Howe <[email protected]>
2024-11-04 23:44 ` Re: nth_value out of more than n values returns null David G. Johnston <[email protected]>
@ 2024-11-05 00:49 ` Tom Lane <[email protected]>
1 sibling, 0 replies; 6+ messages in thread
From: Tom Lane @ 2024-11-05 00:49 UTC (permalink / raw)
To: David G. Johnston <[email protected]>; +Cc: Guyren Howe <[email protected]>; Erik Wienhold <[email protected]>; pgsql-general
"David G. Johnston" <[email protected]> writes:
> So just use “offset 5_000_000 limit 1”. Bringing in a window function here
> seems unhelpful.
Yeah, that. A bite-size example might help clarify what the window
function is doing:
regression=# create table zed(f1 int) ;
CREATE TABLE
regression=# insert into zed select generate_series(1, 10);
INSERT 0 10
regression=# select f1, nth_value(f1, 5) over (order by f1) from zed;
f1 | nth_value
----+-----------
1 |
2 |
3 |
4 |
5 | 5
6 | 5
7 | 5
8 | 5
9 | 5
10 | 5
(10 rows)
For the first four rows, the window frame doesn't include the row
you want, so you get NULL. You can fix that with a non-default
window frame:
regression=# select f1, nth_value(f1, 5) over (order by f1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) from zed;
f1 | nth_value
----+-----------
1 | 5
2 | 5
3 | 5
4 | 5
5 | 5
6 | 5
7 | 5
8 | 5
9 | 5
10 | 5
(10 rows)
So yeah, you can get the fifth (or five million'th) row this way, but
you'll get N copies of it, which I assume is not what you want.
Better
regression=# select f1 from zed order by f1 offset 4 limit 1;
f1
----
5
(1 row)
which gets you just the one row and is a lot cheaper too.
regards, tom lane
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2024-11-05 00:49 UTC | newest]
Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-11-04 23:36 Re: nth_value out of more than n values returns null Erik Wienhold <[email protected]>
2024-11-04 23:38 ` Guyren Howe <[email protected]>
2024-11-04 23:44 ` David G. Johnston <[email protected]>
2024-11-04 23:46 ` Guyren Howe <[email protected]>
2024-11-04 23:53 ` David G. Johnston <[email protected]>
2024-11-05 00:49 ` Tom Lane <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox