public inbox for [email protected]
help / color / mirror / Atom feedPlanning of sub partitions
2+ messages / 1 participants
[nested] [flat]
* Planning of sub partitions
@ 2024-07-05 15:24 Matt Hughes <[email protected]>
2024-07-05 16:06 ` Re: Planning of sub partitions Matt Hughes <[email protected]>
0 siblings, 1 reply; 2+ messages in thread
From: Matt Hughes @ 2024-07-05 15:24 UTC (permalink / raw)
To: [email protected]
I am trying to model a stateful `event` table using partitions. An event
has date_raised (not null) and date_cleared (nullable) columns. An event
is in the "open" state when it has a null date_cleared; it is in the
"closed" state when date_cleared is set. Once date_cleared is set, it
won't change.
While most events close after a short period of time, a few stragglers stay
open for weeks or months. I'm trying to optimize my system for the
following things:
- I need to drop events older than N days; I want to use partitions so I
can just drop the table rather than an expensive DELETE
- Users want to see all open events OR closed events within the past N days
To do this, I came up with the following schema:
create table event (
id uuid not null,
cleared boolean not null,
date_raised timestamp without time zone not null,
date_cleared timestamp without time zone,
primary key (id, date_raised, cleared)
) PARTITION BY LIST (cleared);
CREATE TABLE event_open PARTITION OF event FOR VALUES IN (false);
CREATE TABLE event_closed PARTITION OF event FOR VALUES IN (true) partition
by range(date_raised);
CREATE TABLE event_closed_y2024_m01 PARTITION OF event_closed FOR VALUES
FROM ('2024-01-01') to ('2024-02-01');
CREATE TABLE event_closed_y2024_m02 PARTITION OF event_closed FOR VALUES
FROM ('2024-02-01') to ('2024-03-01');
This works for the most part but the plan for my compound query does not
behave as I would expect:
-- 1. correctly only picks event_open partition
explain select * from event where cleared is false;
-- 2. correctly picks all event_closed_* partitions
explain select * from event where cleared is true;
-- 3. correctly picks just the event_closed_y2024_m01 partition
explain select * from event where cleared is true and date_raised >
'2024-01-01' AND date_raised < '2024-01-02';
-- 4. uses all partitions; should exclude event_closed_y2024_m02
explain select * from event
where
cleared is false OR
(cleared is true and date_raised > '2024-01-01' AND date_raised <
'2024-01-02');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..110.91 rows=2322 width=25)
-> Seq Scan on event_open event_1 (cost=0.00..33.10 rows=774 width=25)
Filter: ((cleared IS FALSE) OR ((date_raised > '2024-01-01
00:00:00'::timestamp without time zone) AND (date_raised < '2024-01-02
00:00:00'::timestamp without time zone)))
-> Seq Scan on event_closed_y2024_m01 event_2 (cost=0.00..33.10
rows=774 width=25)
Filter: ((cleared IS FALSE) OR ((date_raised > '2024-01-01
00:00:00'::timestamp without time zone) AND (date_raised < '2024-01-02
00:00:00'::timestamp without time zone)))
-> Seq Scan on event_closed_y2024_m02 event_3 (cost=0.00..33.10
rows=774 width=25)
Filter: ((cleared IS FALSE) OR ((date_raised > '2024-01-01
00:00:00'::timestamp without time zone) AND (date_raised < '2024-01-02
00:00:00'::timestamp without time zone)))
Is this a bug? Query 4 is just a union of queries 2/3 which pick the right
partitions. Do you see anything else wrong with this approach?
Note that `closed` column is somewhat of a hack as it is completely
computeable from `date_cleared is not null`. However, as I understand it,
partitions can only be declared on values that are part of the primary
key.
^ permalink raw reply [nested|flat] 2+ messages in thread
* Re: Planning of sub partitions
2024-07-05 15:24 Planning of sub partitions Matt Hughes <[email protected]>
@ 2024-07-05 16:06 ` Matt Hughes <[email protected]>
0 siblings, 0 replies; 2+ messages in thread
From: Matt Hughes @ 2024-07-05 16:06 UTC (permalink / raw)
To: [email protected]
One workaround seems to be using a CTE with a union:
with openAndRecentlyRaisedEvents as (
select * from event where cleared is false
union all
select * from event
where cleared is true and date_raised > '2024-01-01' AND date_raised <
'2024-01-02';
)
select * from openAndREcentlyRaisedEvents
On Fri, Jul 5, 2024 at 11:24 AM Matt Hughes <[email protected]> wrote:
> I am trying to model a stateful `event` table using partitions. An event
> has date_raised (not null) and date_cleared (nullable) columns. An event
> is in the "open" state when it has a null date_cleared; it is in the
> "closed" state when date_cleared is set. Once date_cleared is set, it
> won't change.
>
> While most events close after a short period of time, a few stragglers
> stay open for weeks or months. I'm trying to optimize my system for the
> following things:
>
> - I need to drop events older than N days; I want to use partitions so I
> can just drop the table rather than an expensive DELETE
> - Users want to see all open events OR closed events within the past N days
>
> To do this, I came up with the following schema:
>
> create table event (
> id uuid not null,
> cleared boolean not null,
> date_raised timestamp without time zone not null,
> date_cleared timestamp without time zone,
> primary key (id, date_raised, cleared)
> ) PARTITION BY LIST (cleared);
>
> CREATE TABLE event_open PARTITION OF event FOR VALUES IN (false);
> CREATE TABLE event_closed PARTITION OF event FOR VALUES IN (true)
> partition by range(date_raised);
> CREATE TABLE event_closed_y2024_m01 PARTITION OF event_closed FOR VALUES
> FROM ('2024-01-01') to ('2024-02-01');
> CREATE TABLE event_closed_y2024_m02 PARTITION OF event_closed FOR VALUES
> FROM ('2024-02-01') to ('2024-03-01');
>
> This works for the most part but the plan for my compound query does not
> behave as I would expect:
>
> -- 1. correctly only picks event_open partition
> explain select * from event where cleared is false;
>
> -- 2. correctly picks all event_closed_* partitions
> explain select * from event where cleared is true;
>
> -- 3. correctly picks just the event_closed_y2024_m01 partition
> explain select * from event where cleared is true and date_raised >
> '2024-01-01' AND date_raised < '2024-01-02';
>
> -- 4. uses all partitions; should exclude event_closed_y2024_m02
> explain select * from event
> where
> cleared is false OR
> (cleared is true and date_raised > '2024-01-01' AND date_raised <
> '2024-01-02');
>
> QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Append (cost=0.00..110.91 rows=2322 width=25)
> -> Seq Scan on event_open event_1 (cost=0.00..33.10 rows=774 width=25)
> Filter: ((cleared IS FALSE) OR ((date_raised > '2024-01-01
> 00:00:00'::timestamp without time zone) AND (date_raised < '2024-01-02
> 00:00:00'::timestamp without time zone)))
> -> Seq Scan on event_closed_y2024_m01 event_2 (cost=0.00..33.10
> rows=774 width=25)
> Filter: ((cleared IS FALSE) OR ((date_raised > '2024-01-01
> 00:00:00'::timestamp without time zone) AND (date_raised < '2024-01-02
> 00:00:00'::timestamp without time zone)))
> -> Seq Scan on event_closed_y2024_m02 event_3 (cost=0.00..33.10
> rows=774 width=25)
> Filter: ((cleared IS FALSE) OR ((date_raised > '2024-01-01
> 00:00:00'::timestamp without time zone) AND (date_raised < '2024-01-02
> 00:00:00'::timestamp without time zone)))
>
>
> Is this a bug? Query 4 is just a union of queries 2/3 which pick the
> right partitions. Do you see anything else wrong with this approach?
>
> Note that `closed` column is somewhat of a hack as it is completely
> computeable from `date_cleared is not null`. However, as I understand it,
> partitions can only be declared on values that are part of the primary
> key.
>
>
^ permalink raw reply [nested|flat] 2+ messages in thread
end of thread, other threads:[~2024-07-05 16:06 UTC | newest]
Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-07-05 15:24 Planning of sub partitions Matt Hughes <[email protected]>
2024-07-05 16:06 ` Matt Hughes <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox