MIME-Version: 1.0
References: <CAFeSbqh0Mj3bm9+aCaz5g4NhKn8+t4aGF=p5vOPc5oVssveATQ@mail.gmail.com>
 <0ba329ef-62aa-4ab3-aefd-141baabced3b@aklaver.com> <CAFeSbqijFCW9xFOfapTzebbPcv2sWgpgrS1kVfFNJ+F7sA8R=A@mail.gmail.com>
In-Reply-To: <CAFeSbqijFCW9xFOfapTzebbPcv2sWgpgrS1kVfFNJ+F7sA8R=A@mail.gmail.com>
From: Saul Perdomo <saul.perdomo@gmail.com>
Date: Thu, 23 Jan 2025 09:26:51 -0500
Message-ID: <CAN3jBgFEX-fhXuNkrMYwCeWjtYK2_zSrvEmefkUZciLPHK7Psw@mail.gmail.com>
Subject: Re: Return of the pg_wal issue..
To: Paul Brindusa <paulbrindusa88@gmail.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Content-Type: multipart/alternative; boundary="000000000000a9ce06062c606707"
Archived-At: <https://www.postgresql.org/message-id/CAN3jBgFEX-fhXuNkrMYwCeWjtYK2_zSrvEmefkUZciLPHK7Psw%40mail.gmail.com>
Precedence: bulk

--000000000000a9ce06062c606707
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hey Paul,

Regarding

*"I've not managed to test the queries out yet. But I am planning to test
out in my lab environment.*
*Sorry am really cautious about this as those are the main production
databases."*

As a dispassionate third-party observer, I can confirm that all SELECT and
SHOW queries from Laurenz's blog post are read-only. They're completely
safe to run in the affected environment.
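
If you'd like an extra safety net on production, you can run the checks inside a READ ONLY transaction, so PostgreSQL itself rejects any write to a regular table. Here is a minimal Python sketch that just prints such a psql script. The query list is my own illustrative pick of standard views and settings, not a copy of the queries in Laurenz's post, and the helper name is made up:

```python
# Hypothetical helper: wraps diagnostic statements in a READ ONLY transaction
# so that the server refuses writes to regular tables even if one slips in.
# The statements below are illustrative examples, not the blog post's queries.
DIAGNOSTIC_QUERIES = [
    "SHOW wal_keep_size",
    "SHOW max_wal_size",
    "SHOW archive_command",
    "SELECT slot_name, slot_type, active, restart_lsn FROM pg_replication_slots",
    "SELECT archived_count, failed_count, last_failed_wal FROM pg_stat_archiver",
]


def build_read_only_script(queries):
    """Return a psql script that runs the queries in a READ ONLY transaction."""
    lines = ["BEGIN TRANSACTION READ ONLY;"]
    # Normalise each statement to end with exactly one semicolon.
    lines += [q.rstrip(";") + ";" for q in queries]
    # ROLLBACK rather than COMMIT: there is nothing to keep.
    lines.append("ROLLBACK;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_read_only_script(DIAGNOSTIC_QUERIES), end="")
```

You could save the output to a file and feed it to psql on the primary; the file name and connection options are up to you.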

On Thu, Jan 23, 2025 at 6:40 AM Paul Brindusa <paulbrindusa88@gmail.com> wrote:

> Hopefully the below is going to give a little bit more insight on the
> issue.
> I will mention as well that the cluster also replicates data to another
> MySQL database, in case that is relevant.
> Also worth noting: this is our production cluster, and we have another
> pre-production cluster with basically the same settings where the issue
> does not occur.
>
> A good deal more information is needed to troubleshoot this:
>
> 1) Postgres version(s).
>
> postgres (PostgreSQL) 15.10
>
> 2) The Patroni version.
>
> patroni 4.0.4
>
> 3) The Patroni configuration.
>
> scope: postgres-cluster
> name: db01
> namespace: /service/
>
> log:
>   level: INFO
>   traceback_level: ERROR
>   format: "%(asctime)s %(levelname)s: %(message)s"
>   dateformat: ""
>   max_queue_size: 1000
>   dir: /var/log/patroni
>   file_num: 4
>   file_size: 25000000
>   loggers:
>     patroni.postmaster: WARNING
>     urllib3: WARNING
>
> restapi:
>   listen: x.x.x.98:8008
>   connect_address: x.x.x.98:8008
>
> etcd3:
>   hosts: db01.local:2379,db02.local:2379,db03.local:2379
>
>
> bootstrap:
>   dcs:
>     ttl: 30
>     loop_wait: 10
>     retry_timeout: 10
>     maximum_lag_on_failover: 1048576
>     postgresql:
>       use_pg_rewind: true
>       use_slots: true
>       parameters:
>         max_connections: 500
>         superuser_reserved_connections: 5
>         password_encryption: scram-sha-256
>         max_locks_per_transaction: 512
>         max_prepared_transactions: 0
>         huge_pages: try
>         shared_buffers: 128MB
>         effective_cache_size: 4GB
>         work_mem: 128MB
>         maintenance_work_mem: 256MB
>         checkpoint_timeout: 15min
>         checkpoint_completion_target: 0.9
>         min_wal_size: 80MB
>         max_wal_size: 1GB
>         wal_buffers: 32MB
>         default_statistics_target: 1000
>         seq_page_cost: 1
>         random_page_cost: 4
>         effective_io_concurrency: 2
>         synchronous_commit: on
>         autovacuum: on
>         autovacuum_max_workers: 5
>         autovacuum_vacuum_scale_factor: 0.01
>         autovacuum_analyze_scale_factor: 0.01
>         autovacuum_vacuum_cost_limit: 500
>         autovacuum_vacuum_cost_delay: 2
>         autovacuum_naptime: 1s
>         max_files_per_process: 4096
>         archive_mode: on
>         archive_timeout: 1800s
>         archive_command: cd .
>         wal_level: replica
>         wal_keep_size: 2GB
>         max_wal_senders: 10
>         max_replication_slots: 10
>         hot_standby: on
>         wal_log_hints: on
>         wal_compression: on
>         shared_preload_libraries: pgaudit
>         track_io_timing: on
>         log_lock_waits: on
>         log_temp_files: 0
>         track_activities: on
>         track_counts: on
>         track_functions: all
>         log_checkpoints: on
>         logging_collector: on
>         log_truncate_on_rotation: on
>         log_rotation_age: 1d
>         log_rotation_size: 1GB
>         log_line_prefix: '%m [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h '
>         log_filename: postgresql-%Y-%m-%d.log
>         log_directory: /var/log/pgsql
>         log_connections: on
>         log_disconnections: on
>         log_statement: ddl
>         log_error_verbosity: verbose
>         hot_standby_feedback: on
>         max_standby_streaming_delay: 30s
>         wal_receiver_status_interval: 10s
>         idle_in_transaction_session_timeout: 10min
>         jit: off
>         max_worker_processes: 24
>         max_parallel_workers: 8
>         max_parallel_workers_per_gather: 2
>         max_parallel_maintenance_workers: 2
>
>   initdb:
>   - encoding: UTF8
>   - data-checksums
>
>   pg_hba:
>   - host replication replicator 127.0.0.1/32 md5
>
>   - host replication replicator x.x.x.98/27 scram-sha-256
>
>
>
>   - host replication replicator x.x.x.99/27 scram-sha-256
>
>
>
>   - host replication replicator x.x.x.100/27 scram-sha-256
>
>
>   - host all all 0.0.0.0/0 md5
>
> postgresql:
>   listen: x.x.x.98:5432
>   connect_address: x.x.x.98:5432
>   data_dir: /var/lib/pgsql/data
>   bin_dir: /usr/bin
>   pgpass: /var/lib/pgsql/.pgpass_patroni
>   authentication:
>     replication:
>       username: replicator
>       password: password
>     superuser:
>       username: postgres
>       password: password
>   parameters:
>     unix_socket_directories: /var/run/postgresql
>
>   remove_data_directory_on_rewind_failure: false
>   remove_data_directory_on_diverged_timelines: false
>
>   create_replica_methods:
>     - basebackup
>   basebackup:
>     max-rate: '100M'
>     checkpoint: 'fast'
>
> watchdog:
>   mode: required
>   device: /dev/watchdog
>   safety_margin: 5
>
> tags:
>   nofailover: false
>   noloadbalance: false
>   clonefrom: false
>   nosync: false
>
> 4) Definition of 'ridiculous rate'.
>
> 1GB / day
>
> 5) Relevant information from the logs.
>
> Below is an excerpt from today's log up to this point in time, which I
> think might be relevant. I cannot see anything specific. If anything else
> is needed, please let me know.
>
> 2<REDACTED>:<REDACTED> GMT [186889]: [863-1] db=,user=,app=,client= LOG:  00000: checkpoint starting: time
> 2<REDACTED>:<REDACTED> GMT [186889]: [864-1] db=,user=,app=,client= LOCATION:  LogCheckpointStart, xlog.c:6121
> 2<REDACTED>:<REDACTED> GMT [186889]: [865-1] db=,user=,app=,client= LOG:  00000: checkpoint complete: wrote 66 buffers (0.4%); 0 WAL file(s) added, 0 removed, 0 recycled; write=6.563 s, sync=0.003 s, total=6.619 s; sync files=22, longest=0.002 s, average=0.001 s; distance=776 kB, estimate=56426 kB
> 2<REDACTED>:<REDACTED> GMT [186889]: [866-1] db=,user=,app=,client= LOCATION:  LogCheckpointEnd, xlog.c:6202
> 2<REDACTED>:<REDACTED> GMT [2439188]: [7-1] db=documentation-database,user=documentation-database-user,app=PostgreSQL JDBC Driver,client=<REDACTED> LOG:  00000: disconnection: session time: 0:<REDACTED> user=documentation-database-user database=documentation-database host=<REDACTED> port=56170
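
(Inline note on the log excerpt above.) A single checkpoint entry can't show a trend, so it may help to pull the WAL counters out of every "checkpoint complete" line and watch them over a day. Below is a minimal Python sketch, assuming the log_checkpoints line format shown in the excerpt; the regular expression and the function name are my own, not something from Laurenz's post:

```python
import re

# Matches the "checkpoint complete" message as it appears in the excerpt above.
# Assumption: the PostgreSQL 15 log_checkpoints wording; other versions may differ.
CHECKPOINT_RE = re.compile(
    r"checkpoint complete: wrote (?P<buffers>\d+) buffers .*?"
    r"(?P<added>\d+) WAL file\(s\) added, "
    r"(?P<removed>\d+) removed, "
    r"(?P<recycled>\d+) recycled;.*?"
    r"distance=(?P<distance_kb>\d+) kB, estimate=(?P<estimate_kb>\d+) kB"
)


def parse_checkpoint(line):
    """Return the checkpoint counters as a dict of ints, or None if no match."""
    match = CHECKPOINT_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


# The checkpoint line from the excerpt, with the redacted prefix left out.
SAMPLE = (
    "LOG:  00000: checkpoint complete: wrote 66 buffers (0.4%); "
    "0 WAL file(s) added, 0 removed, 0 recycled; write=6.563 s, "
    "sync=0.003 s, total=6.619 s; sync files=22, longest=0.002 s, "
    "average=0.001 s; distance=776 kB, estimate=56426 kB"
)

if __name__ == "__main__":
    print(parse_checkpoint(SAMPLE))
```

If "removed" and "recycled" stay at 0 across many checkpoints while pg_wal keeps growing, that would hint that something is holding WAL back, though one entry on its own proves nothing.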
>
>
> @Laurenz
>
> I guess you are referring to
> https://www.cybertec-postgresql.com/en/why-does-my-pg_wal-keep-growing/
>
> *Yes, that is the one.*
>
> I listed all the reasons I know for your predicament.
> Did you do some research along these lines?
>
> *I've had a look at the things that you have mentioned in the guide. *
>
> If yes, what did you find?
>
> *I've not managed to test the queries out yet. But I am planning to test
> out in my lab environment.*
> *Sorry am really cautious about this as those are the main production
> databases.*
>
> *Hope the above is going to give a bit of insight on the root cause of the
> problem.*
>
>
>
> Yours,
> Laurenz Albe
>
>
>
> On Wed, Jan 22, 2025 at 6:03 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
>> On 1/22/25 09:33, Paul Brindusa wrote:
>> > Good afternoon,
>> >
>> > Following below, we are facing a similar issue and I'm getting a real buzz
>> > to get this working myself; speaking to the DBA in my company has
>> > actually left me a bit cold, as he is not good with Postgres.
>> >
>> > So I want to try and get a solution for this and fix this issue with the
>> > pg_wal files filling up the drive at a ridiculous rate. I have been
>> > manually moving logs to a different directory but have had no luck in
>> > finding an actual solution.
>> >
>> > The cluster is a 3-node HA cluster running with Patroni.
>> >
>> > Please help me out. I will mention that I have a test cluster spun up in
>> > case something needs testing.
>> >
>> > Also want to give a shout-out to Laurenz Albe for posting stuff about
>> > WAL files on his company blog.
>> >
>> > Again any help will be greatly appreciated.
>>
>> A good deal more information is needed to troubleshoot this:
>>
>> 1) Postgres version(s).
>>
>> 2) The Patroni version.
>>
>> 3) The Patroni configuration.
>>
>> 4) Definition of 'ridiculous rate'.
>>
>> 5) Relevant information from the logs.
>>
>> >
>> >
>> > " On one of our postgres instances we have the pg_wal/data folder up to
>> > 196GB, out of 200GB disk filled up.
>> > This has stopped the postgresql.service this morning causing two
>> > applications to crash.
>> > Unfortunately our database admin is on leave today, and we are trying to
>> > figure out how to get the disk down?
>> > Any ideas or suggestions are more than welcome.
>> >
>> > Thank you in advance."
>> >
>> >
>> > --
>> > Kind Regards,
>> > Paul Brindusa
>> > paulbrindusa88@gmail.com
>> >
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>>
>
> --
> Kind Regards,
> Paul Brindusa
> paulbrindusa88@gmail.com
>
>

--000000000000a9ce06062c606707--