From: Manan Kansara <[email protected]>
To: [email protected]
Subject: About replication minimal disk space usage
Date: Sat, 24 Aug 2024 17:48:19 +0530
Message-ID: <CANz4VOO+GSdUYVrid+YvpxPzOZ6SWiB_T9FJL2H-PqAFVa9AsA@mail.gmail.com>

Hello all,
I run a self-hosted Postgres server on AWS with 16 GB of disk space
attached. For ML and analysis work we use Vertex AI, so I have set up
live replication from Postgres to a BigQuery table using the Datastream
service. We use BigQuery as our data warehouse because we have many
different data sources, and this lets our data analysis and ML happen in
one place.
The problem is that once I start replication, pg_wal grows until it takes
up almost the whole disk, about 15.8 GB, within a few days.

*Question*: How can I set this up so that disk space is used optimally and
old pg_wal data that is no longer needed gets deleted? I was thinking of
creating a cron job to take care of this, but I don't know the right
approach. Can you please guide me?
As the data grows I will attach more disk space to the instance, but I
want an optimal setup so that the disk never fills up completely and the
server does not crash again.
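
For reference, a rough sketch of how the WAL held back by a replication
slot could be measured before deciding on any cleanup. It assumes the LSN
strings come from pg_current_wal_lsn() and from the restart_lsn column of
pg_replication_slots; the function names, the slot names and the 4 GiB
threshold are hypothetical and chosen only for illustration. It only does
the arithmetic and does not connect to the server or delete anything.

```python
# Sketch only: estimates how much WAL each replication slot forces the
# server to keep, given LSN strings copied from the database.

def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position.

    An LSN is a 64-bit value printed as two hexadecimal numbers:
    the high 32 bits, a slash, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and the current position."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


def slots_over_limit(slots: dict, current_lsn: str, limit_bytes: int) -> list:
    """Return the names of slots retaining more than limit_bytes of WAL.

    `slots` maps slot name -> restart_lsn string (hypothetical input shape).
    """
    return sorted(
        name
        for name, restart_lsn in slots.items()
        if retained_wal_bytes(current_lsn, restart_lsn) > limit_bytes
    )


if __name__ == "__main__":
    # Made-up values, not taken from the server described above.
    current = "3/F0000000"
    slots = {"datastream_slot": "0/10000000", "other_slot": "3/E0000000"}
    limit = 4 * 1024**3  # hypothetical alert threshold of 4 GiB
    for name, lsn in slots.items():
        print(name, retained_wal_bytes(current, lsn), "bytes retained")
    print("over limit:", slots_over_limit(slots, current, limit))
```

A slot that keeps showing up over the limit would suggest that the
consumer is not confirming what it has read. On PostgreSQL 13 and later,
the max_slot_wal_keep_size setting can cap how much WAL a slot may hold
back, at the cost of invalidating a slot that falls too far behind.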

