Postgres DB design Question (High Level)

public inbox for [email protected]  
help / color / mirror / Atom feed

Postgres DB design Question (High Level)
3+ messages / 2 participants
[nested] [flat]

* Postgres DB design Question (High Level)
@ 2025-06-19 04:50 Al Grant <[email protected]>
  2025-06-19 05:00 ` Re: Postgres DB design Question (High Level) Laurenz Albe <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Al Grant @ 2025-06-19 04:50 UTC (permalink / raw)
  To: [email protected]

Introduction:

I have a question about the design of a project as it relates to
databases, and the scale-ability of the design. The project is
volunteer, so there is no commercial interest.

But first a bit of background:

Background:

I have programmed a rasp pi to record radio beeps from wildlife
trackers, where the beep rate per minute (bpm) can be either 80, 40,
or 30. The rate can only change once every 24 hours. The beeps are
transmitted on up to 100 channels and the animals go in an out of
range on a given day. This data is written to a Sqlite3 db on the Rpi.

Since the beep rate will not change in a given 24 hour period, and
since the rasp pi runs on a solar/battery setup it wakes up for 2
hours every day to record the radio signals and shuts down, so for a
given 24 hour period I only get 2 hours of data (anywhere between
about 5-15,000 beeps depending on beep rate and assuming the animal
stays within range).

The rpi Sqlite3 DB is sync'd over cellular to a postgresql database on
my server at the end of each days 2 hour recording period.

Since I am processing radio signals there is always the chance of
random interference being decoded as a valid beep. To avoid a small
amount of interference being detected as a valid signal, I check for
quantity of valid beeps within a given 1 hour window - so for example
if the beep rate is 80 it checks that there are 50% of the maximum
beep rate detected (ie 80*60*0.5) - if there is only a handful of
beeps it is discarded.

Database design:

The BPM table is very simple:

Id
Bpm_rate Integer
dt DateTime

I want to create a web based dashboard for all the currently detected
signals, where the dashboard contains a graph of the daily beep rate
for each channel (max 100 channels) over user selectable periods from
1 week to 1 year - that query does not scale well if I query the bpm
table.

To avoid this I have created a bpm summary table which is generated
periodically (hourly) off the bpm table. The bpm summary table
contains the dominant beep rate for a given hour (so 2 records per day
per channel assuming a signal is detected).

Does this summary table approach make sense?

I have noted that I am periodically syncing from MySQL to the server,
and then periodically updating the summary table - its multi stage
syncing and I wonder if that makes this approach fragile (although I
don't see any alternative).

Thanks in advance for any input.

Al

^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: Postgres DB design Question (High Level)
  2025-06-19 04:50 Postgres DB design Question (High Level) Al Grant <[email protected]>
@ 2025-06-19 05:00 ` Laurenz Albe <[email protected]>
  2025-06-19 05:08   ` Re: Postgres DB design Question (High Level) Al Grant <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Laurenz Albe @ 2025-06-19 05:00 UTC (permalink / raw)
  To: Al Grant <[email protected]>; [email protected]

On Thu, 2025-06-19 at 16:50 +1200, Al Grant wrote:
> I want to create a web based dashboard for all the currently detected
> signals, where the dashboard contains a graph of the daily beep rate
> for each channel (max 100 channels) over user selectable periods from
> 1 week to 1 year - that query does not scale well if I query the bpm
> table.
> 
> To avoid this I have created a bpm summary table which is generated
> periodically (hourly) off the bpm table. The bpm summary table
> contains the dominant beep rate for a given hour (so 2 records per day
> per channel assuming a signal is detected).
> 
> Does this summary table approach make sense?

Yes.  Pre-aggregation to reduce the data volume is common in a
data warehouse.  Ideally, you don't have to scan the entire base
table to create the summary.  You can consider a trigger for that,
but more elegant might be a solution based on partitioning, where
you aggregate a partition once it is complete.

> I have noted that I am periodically syncing from MySQL to the server,
> and then periodically updating the summary table - its multi stage
> syncing and I wonder if that makes this approach fragile (although I
> don't see any alternative).

"MySQL" and "fragile" in the same sentence.  Sounds about right.

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: Postgres DB design Question (High Level)
  2025-06-19 04:50 Postgres DB design Question (High Level) Al Grant <[email protected]>
  2025-06-19 05:00 ` Re: Postgres DB design Question (High Level) Laurenz Albe <[email protected]>
@ 2025-06-19 05:08   ` Al Grant <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: Al Grant @ 2025-06-19 05:08 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: [email protected]

Not sure if it makes a big difference to the answer, but MySQL was a
typo - the edge db on the Rpi is Sqlite3.

On Thu, Jun 19, 2025 at 5:00 PM Laurenz Albe <[email protected]> wrote:
>
> On Thu, 2025-06-19 at 16:50 +1200, Al Grant wrote:
> > I want to create a web based dashboard for all the currently detected
> > signals, where the dashboard contains a graph of the daily beep rate
> > for each channel (max 100 channels) over user selectable periods from
> > 1 week to 1 year - that query does not scale well if I query the bpm
> > table.
> >
> > To avoid this I have created a bpm summary table which is generated
> > periodically (hourly) off the bpm table. The bpm summary table
> > contains the dominant beep rate for a given hour (so 2 records per day
> > per channel assuming a signal is detected).
> >
> > Does this summary table approach make sense?
>
> Yes.  Pre-aggregation to reduce the data volume is common in a
> data warehouse.  Ideally, you don't have to scan the entire base
> table to create the summary.  You can consider a trigger for that,
> but more elegant might be a solution based on partitioning, where
> you aggregate a partition once it is complete.
>
> > I have noted that I am periodically syncing from MySQL to the server,
> > and then periodically updating the summary table - its multi stage
> > syncing and I wonder if that makes this approach fragile (although I
> > don't see any alternative).
>
> "MySQL" and "fragile" in the same sentence.  Sounds about right.
>
> Yours,
> Laurenz Albe



-- 
"Beat it punk!"
- Clint Eastwood






^ permalink  raw  reply  [nested|flat] 3+ messages in thread

end of thread, other threads:[~2025-06-19 05:08 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-06-19 04:50 Postgres DB design Question (High Level) Al Grant <[email protected]>
2025-06-19 05:00 ` Laurenz Albe <[email protected]>
2025-06-19 05:08   ` Al Grant <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox