[Architecture] Multi-Environment pg_cron for Automated Partition Management


From: Jishnu Sygal @ 2025-09-17 15:04 UTC (permalink / raw)
  To: [email protected]

Hello Postgres Community,

I am writing to get your expert opinion on a proposed architecture for
managing pg_partman automated partitioning across a multi-environment
PostgreSQL setup. While this solution is primarily for cost savings and is
used in our non-production environments, *stability is a must*. Our core
goal is to create a universal scheduling solution that works identically
across cloud-managed databases (AWS RDS, Google Cloud SQL, Azure Database),
as well as on-premises shared and dedicated hosts.

Business Context & Core Requirements

We need to consistently automate pg_partman maintenance tasks across our
PostgreSQL 17 environments. The primary challenge is maintaining a single,
identical application interface so that the same SQL commands work without
modification, regardless of the deployment model. A key requirement is that
our in-house database schema migration tools will be used to roll out SQL
scripts that directly handle job maintenance and alterations during releases.
Our jobs are low frequency (monthly/quarterly/annually) but are critically
important, as failures can cause serious operational issues.

Our current constraints include:

   - PostgreSQL 17 (with a planned migration to 18).
   - A mix of AWS RDS, Google Cloud SQL, Azure Database, and on-premises
     deployments.
   - Shared hosts with 50-60 databases and dedicated hosts with a single
     database.

Our Proposed Architecture

Because the pg_cron extension can be created in only one database per
instance (the one named by cron.database_name), jobs for every other
database have to be scheduled from that database. We have therefore designed
an abstraction layer to address our multi-environment challenges: pg_cron
lives in a designated database and uses dblink to execute partition
maintenance jobs in the target application databases.
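
For comparison with the dblink route, here is a minimal sketch of scheduling
a job straight into another database from the pg_cron database. It assumes
pg_cron 1.4 or later, where cron.schedule_in_database() exists; the database
name app_db_1 and the job-name prefix are placeholders, and whether each
managed service exposes this function would need to be checked per provider.

```sql
-- Run in the database where pg_cron is installed (cron.database_name).
-- 'app_db_1' is a hypothetical target database; the job name is prefixed
-- with it because pg_cron job names must be unique per user.
SELECT cron.schedule_in_database(
    'app_db_1:monthly_partition_maintenance',  -- job name
    '0 2 1 * *',                               -- 02:00 on the 1st of each month
    'CALL partman.run_maintenance_proc()',     -- a procedure, hence CALL
    'app_db_1'                                 -- database the command runs in
);

-- List what is scheduled and where it will run.
SELECT jobid, jobname, schedule, database, active
FROM cron.job
ORDER BY database, jobname;
```

If this is available everywhere, pg_cron opens the connection to the target
database itself and no dblink hop is needed for execution.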

Key elements of this design are:

   - *Universal API Layer:* Every application database would have an
     identical cron schema with wrapper functions.
   - *Identical Application Interface:* Applications use the exact same SQL
     statement, for example:
     SELECT cron.schedule('monthly_partition_maintenance', '0 2 1 * *',
     'CALL partman.run_maintenance_proc()');
     (run_maintenance_proc is a procedure, so the job command uses CALL
     rather than SELECT.)
   - *Environment-Adaptive Communication:* dblink is used for multi-database
     environments, while direct calls are used for single-database setups.
   - *PostgreSQL 18 Future-Proofing:* We plan to leverage SCRAM pass-through
     authentication for dblink to eliminate the need for storing credentials.
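
To make the wrapper idea concrete, here is a rough sketch of what the
per-database cron.schedule() wrapper could look like on a shared host. It is
not our final implementation. It assumes the dblink extension is installed in
the application database, that a foreign server named cron_hub (a made-up
name) with a user mapping points at the pg_cron database, and that
cron.schedule_in_database() is available there.

```sql
-- Lives in each application database on a shared host.
-- On a single-database host pg_cron itself provides cron.schedule(),
-- so this wrapper would not be installed there.
CREATE SCHEMA IF NOT EXISTS cron;

CREATE OR REPLACE FUNCTION cron.schedule(job_name text, schedule text, command text)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
    job_id bigint;
BEGIN
    -- dblink() with a foreign server name (instead of a named connection)
    -- opens a connection for this one call and closes it afterwards.
    -- 'cron_hub' is a hypothetical foreign server for the pg_cron database.
    SELECT t.jobid INTO job_id
    FROM dblink(
        'cron_hub',
        format(
            'SELECT cron.schedule_in_database(%L, %L, %L, %L)',
            current_database() || ':' || job_name,  -- keep names unique per database
            schedule,
            command,
            current_database()
        )
    ) AS t(jobid bigint);

    RETURN job_id;
END;
$$;
```

With this in place the application statement stays the same in every
environment; only the function body behind cron.schedule() differs.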

Technical Questions & Concerns

We have detailed several technical questions below and would greatly
appreciate your insights and validation.

*1. Architecture Validation*

   - Is this universal abstraction layer a sound architectural pattern for
     managing pg_partman across diverse environments, especially given our
     focus on using in-house tools for job management?
   - How should the architecture specifically adapt for AWS RDS vs. Google
     Cloud SQL vs. Azure Database, especially concerning their limitations on
     pg_cron or connection management?
   - Does this design scale appropriately from a single dedicated database to
     a shared host with 50-60 databases and 400-900 partition jobs?

*2. Connection & Performance*

   - Is our proposed temporary-connection-with-guaranteed-cleanup pattern
     robust enough for production? We are concerned about dblink connection
     exhaustion risks and connection leaks, especially in shared environments
     with many jobs.
   - Given the low frequency of jobs, will the dblink overhead be
     significant? Are there specific performance optimizations we should
     consider for cloud-managed services vs. on-premises?
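
As an illustration of the temporary-connection-with-cleanup pattern we have
in mind, a minimal sketch follows. The maint schema, the function name and
the foreign-server argument are placeholders, and dblink is assumed to be
installed and on the search_path.

```sql
CREATE SCHEMA IF NOT EXISTS maint;

-- Runs one command in a remote database over a short-lived named connection.
-- server_name is assumed to be an existing foreign server with a user mapping.
CREATE OR REPLACE FUNCTION maint.run_in_database(server_name text, command text)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
    -- One connection name per backend, so concurrent jobs do not collide.
    conn_name text := 'maint_' || pg_backend_pid();
BEGIN
    PERFORM dblink_connect(conn_name, server_name);

    BEGIN
        -- dblink_exec raises an error if the remote command fails.
        PERFORM dblink_exec(conn_name, command);
    EXCEPTION WHEN OTHERS THEN
        -- Close the connection before passing the original error on.
        PERFORM dblink_disconnect(conn_name);
        RAISE;
    END;

    PERFORM dblink_disconnect(conn_name);
END;
$$;
```

A named dblink connection belongs to the backend that opened it, so it is
also closed when that session ends; the explicit disconnect matters mainly
for long-lived sessions that run many jobs.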

*3. Alternative Approaches*

   - Should we abandon this PostgreSQL-native approach and instead consider
     cloud-native job schedulers (e.g., AWS EventBridge, Google Cloud
     Scheduler, Azure Logic Apps) to trigger maintenance jobs?
   - Are there existing enterprise scheduling solutions that are
     purpose-built for this kind of multi-cloud, on-premises PostgreSQL
     automation?

We are most concerned about connection leaks, unnoticed maintenance
failures, and ensuring a single, identical SQL interface is truly
achievable. We believe this solution could significantly simplify our
operations, but we want to validate its viability and get feedback on
potential pitfalls from the community.
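
On the unnoticed-failure concern, the kind of check we would run against
pg_cron's own bookkeeping looks roughly like the sketch below. It assumes run
logging is enabled (cron.log_run, on by default) and uses a 35-day window,
which only suits the monthly jobs; quarterly and annual jobs would need their
own thresholds.

```sql
-- Recent failed runs, newest first.
SELECT j.jobname, d.database, d.status, d.return_message,
       d.start_time, d.end_time
FROM cron.job_run_details AS d
JOIN cron.job AS j USING (jobid)
WHERE d.status = 'failed'
  AND d.start_time > now() - interval '35 days'
ORDER BY d.start_time DESC;

-- Active jobs with no successful run inside the window
-- (covers jobs that silently never ran at all).
SELECT j.jobid, j.jobname, j.database,
       max(d.end_time) FILTER (WHERE d.status = 'succeeded') AS last_success
FROM cron.job AS j
LEFT JOIN cron.job_run_details AS d USING (jobid)
WHERE j.active
GROUP BY j.jobid, j.jobname, j.database
HAVING coalesce(
           max(d.end_time) FILTER (WHERE d.status = 'succeeded'),
           '-infinity'::timestamptz
       ) < now() - interval '35 days';
```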

Thank you for your time and any insights you can provide.
Best Regards,
Jishnu Sygal

* Re: [Architecture] Multi-Environment pg_cron for Automated Partition Management

From: bertrand HARTWIG @ 2025-09-17 18:26 UTC (permalink / raw)
  To: Jishnu Sygal <[email protected]>; +Cc: [email protected]

Hello,

Interesting proposal — thanks for sharing so much context! 

From my perspective, I’d encourage you to keep it simple. pg_partman jobs are relatively low frequency, which makes monitoring and reliability more important than architectural elegance. 

Relying on pg_cron + dblink across multiple environments adds complexity and risks (connection leaks, security management, debugging difficulties).

If you really want a universal SQL API, consider a thin wrapper in each database that can be triggered externally, rather than a cross-db dblink layer.
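
To illustrate what I mean by a thin wrapper, here is a rough sketch, not a
tested recipe. The maint schema, the run_log table and the procedure name are
made up for the example; it assumes pg_partman lives in the partman schema
and that the procedure is called at top level, outside an explicit
transaction block, so that COMMIT is allowed.

```sql
CREATE SCHEMA IF NOT EXISTS maint;

-- One row per maintenance run, so failures are visible inside the database.
CREATE TABLE IF NOT EXISTS maint.run_log (
    run_id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    started_at  timestamptz NOT NULL DEFAULT clock_timestamp(),
    finished_at timestamptz,
    ok          boolean,
    message     text
);

CREATE OR REPLACE PROCEDURE maint.run_partition_maintenance()
LANGUAGE plpgsql
AS $$
DECLARE
    v_run_id bigint;
    v_ok     boolean;
    v_msg    text;
BEGIN
    INSERT INTO maint.run_log DEFAULT VALUES RETURNING run_id INTO v_run_id;
    COMMIT;  -- the "started" row survives even if the session dies later

    BEGIN
        -- The function variant is used here because a procedure that
        -- commits cannot be called inside a block with an EXCEPTION clause.
        PERFORM partman.run_maintenance();
        v_ok := true;
    EXCEPTION WHEN OTHERS THEN
        v_ok  := false;
        v_msg := SQLERRM;
    END;

    UPDATE maint.run_log
    SET finished_at = clock_timestamp(), ok = v_ok, message = v_msg
    WHERE run_id = v_run_id;
    COMMIT;  -- persist the outcome before signalling failure

    IF NOT v_ok THEN
        RAISE EXCEPTION 'partition maintenance failed: %', v_msg;
    END IF;
END;
$$;
```

Any external scheduler can then run CALL maint.run_partition_maintenance()
through psql with ON_ERROR_STOP set and alert on a non-zero exit code, while
maint.run_log keeps the history in each database.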

My 2 cents: stability and observability usually win over architectural purity in the long run.

Best regards,

Bertrand 

PS: Full disclosure — I’ve had way too many bad adventures with dblink, so maybe I’m a bit biased!




* Re: [Architecture] Multi-Environment pg_cron for Automated Partition Management

From: Jishnu Sygal @ 2025-09-17 19:23 UTC (permalink / raw)
  To: bertrand HARTWIG <[email protected]>; +Cc: [email protected]

Hello Bertrand,

Thank you for your response and for sharing your insights. I appreciate the
emphasis on simplicity, stability, and observability, especially given the
low frequency of these jobs.

Your suggestion to consider a thin wrapper in each database as an
alternative to relying heavily on dblink across environments is very
practical. I will definitely look into how that could work and whether it
can still provide the unified API we are aiming for without the potential
pitfalls you have encountered.

Thanks again for the valuable input.

Regards,
Jishnu
