public inbox for [email protected]  
help / color / mirror / Atom feed
Finding "most recent" using daterange
2+ messages / 2 participants
[nested] [flat]

* Finding "most recent" using daterange
@ 2024-05-22 07:58  Rob Foehl <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: Rob Foehl @ 2024-05-22 07:58 UTC (permalink / raw)
  To: [email protected]

Coming back to PostgreSQL after a (decades-)long absence...  If I have
something like:

CREATE TABLE example (
	id integer NOT NULL,
	value text NOT NULL,
	dates daterange NOT NULL
);

INSERT INTO example VALUES
	(1, 'a', '[2010-01-01,2020-01-01)'),
	(1, 'b', '[2010-01-01,)'),
	(1, 'c', '[,2021-01-01)'),
	(2, 'd', '[2010-01-01,2021-01-01)'),
	(2, 'e', '[2015-01-01,2020-01-01)'),
	(3, 'f', '[2014-01-01,2016-01-01)'),
	(3, 'g', '[2013-01-01,)'),
	(3, 'h', '[2012-01-01,)'),
	(3, 'i', '[2013-01-01,2017-01-01)'),
	(4, 'j', '[2010-01-01,2015-01-01)');

and I want to find the "most recent" value out of each group, meaning
that having the greatest upper bound followed by the greatest lower
bound, what I've managed to come up with thus far is:

WITH intermediate AS (
	SELECT e.id, e.value, e.dates
	FROM example AS e
	JOIN (
		SELECT id, max(coalesce(upper(dates), 'infinity')) AS max_date
		FROM example GROUP BY id
	) AS max_upper ON e.id = max_upper.id
		AND coalesce(upper(dates), 'infinity') = max_upper.max_date
)
SELECT i.id, i.value, i.dates
FROM intermediate AS i
JOIN (
	SELECT id, max(coalesce(lower(dates), '-infinity')) AS max_date
	FROM intermediate GROUP BY id
) AS max_lower ON i.id = max_lower.id
	AND coalesce(lower(dates), '-infinity') = max_lower.max_date;

which produces the desired result for this minimal example:

 id | value |          dates
----+-------+-------------------------
  1 | b     | [2010-01-01,)
  2 | d     | [2010-01-01,2021-01-01)
  3 | g     | [2013-01-01,)
  4 | j     | [2010-01-01,2015-01-01)

I pretty quickly discovered that there's no max(daterange) -- although
it isn't obvious what that would do, anyway -- and the "intermediate"
CTE is what followed.  Is there a better way?

(Note that this doesn't try to handle duplicate ranges -- I haven't
decided whether that'll be necessary in the real case.  Assume it'll
have something beyond daterange NOT NULL and/or some _agg() magic,
eventually.)

-Rob






^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Finding "most recent" using daterange
@ 2024-05-22 18:07  Alban Hertroys <[email protected]>
  parent: Rob Foehl <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Alban Hertroys @ 2024-05-22 18:07 UTC (permalink / raw)
  To: Rob Foehl <[email protected]>; +Cc: pgsql-general <[email protected]>


> On 22 May 2024, at 09:58, Rob Foehl <[email protected]> wrote:
> 
> Coming back to PostgreSQL after a (decades-)long absence...  If I have
> something like:
> 
> CREATE TABLE example (
> id integer NOT NULL,
> value text NOT NULL,
> dates daterange NOT NULL
> );
> 
> INSERT INTO example VALUES
> (1, 'a', '[2010-01-01,2020-01-01)'),
> (1, 'b', '[2010-01-01,)'),
> (1, 'c', '[,2021-01-01)'),
> (2, 'd', '[2010-01-01,2021-01-01)'),
> (2, 'e', '[2015-01-01,2020-01-01)'),
> (3, 'f', '[2014-01-01,2016-01-01)'),
> (3, 'g', '[2013-01-01,)'),
> (3, 'h', '[2012-01-01,)'),
> (3, 'i', '[2013-01-01,2017-01-01)'),
> (4, 'j', '[2010-01-01,2015-01-01)');
> 
> and I want to find the "most recent" value out of each group, meaning
> that having the greatest upper bound followed by the greatest lower
> bound, what I've managed to come up with thus far is:

Sounds like a good candidate for using EXISTS to prove that no more recent value exists for a given id:

SELECT e.id, e.value, e.dates
FROM example AS e
WHERE NOT EXISTS (
        SELECT 1
        FROM example AS i
        WHERE i.id = e.id
        AND (coalesce(upper(i.dates), 'infinity') > coalesce(upper(e.dates), 'infinity')
                OR (coalesce(upper(i.dates), 'infinity') = coalesce(upper(e.dates), 'infinity')
                AND coalesce(lower(i.dates), '-infinity') > coalesce(lower(e.dates), '-infinity'))
        )
);

 id | value |          dates          
----+-------+-------------------------
  1 | b     | [2010-01-01,)
  2 | d     | [2010-01-01,2021-01-01)
  3 | g     | [2013-01-01,)
  4 | j     | [2010-01-01,2015-01-01)
(4 rows)

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.







^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2024-05-22 18:07 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-05-22 07:58 Finding "most recent" using daterange Rob Foehl <[email protected]>
2024-05-22 18:07 ` Alban Hertroys <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox