Date: Thu, 04 Jun 2026 16:51:25 -0700
From: hello from Burnside Project
To: "Jorge Daniel"
Cc: "pgsql-admin@lists.postgresql.org"
Message-Id: <19e950c8337.61bdc2a03840616.4821016053268419825@burnsideproject.ai>
Subject: Re:Pg14 replication issue , recovery stucks in a random file without advancing while streaming from primary
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_13096318_1715792670.1780617085752"

------=_Part_13096318_1715792670.1780617085752
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

1. Replication status

On the primary:

SELECT * FROM pg_stat_replication;

Shows:

* Is the standby connected?
* What LSN has been sent?
* What LSN has been replayed?


2. WAL receiver status

On the standby:

SELECT * FROM pg_stat_wal_receiver;

Shows:

* Is WAL still being received?
* From which server?
* Latest WAL location received?

3. Recovery progress

On the standby:

SELECT
    pg_last_wal_receive_lsn(),
    pg_last_wal_replay_lsn(),
    pg_last_xact_replay_timestamp();

This tells you whether:

* WAL is arriving but not replaying.
* WAL is replaying slowly.
* Replay has completely stopped.
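
One way to tell these three cases apart is to run the query twice, a short interval apart, and compare how far each position moved. The following is a minimal Python sketch, assuming the LSNs are copied in the textual `hi/lo` hex form the functions return. The helper names `lsn_to_int` and `classify`, the category rules, and the sample values are illustrative assumptions, not part of PostgreSQL.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '8EB7/968FEBE8' to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def classify(first, second):
    """Classify replay progress from two (receive_lsn, replay_lsn) samples.

    Both arguments are tuples of textual LSNs taken on the standby,
    `first` earlier than `second`. Returns (label, backlog_in_bytes).
    """
    recv_delta = lsn_to_int(second[0]) - lsn_to_int(first[0])
    replay_delta = lsn_to_int(second[1]) - lsn_to_int(first[1])
    backlog = lsn_to_int(second[0]) - lsn_to_int(second[1])

    if replay_delta == 0 and recv_delta > 0:
        return "WAL arriving but not replaying", backlog
    if replay_delta == 0 and backlog > 0:
        return "replay stopped", backlog
    if replay_delta < recv_delta:
        return "replaying slowly", backlog
    return "replay keeping up", backlog


# Hypothetical pair of samples: the receive position moves, replay does not.
state, backlog = classify(
    ("8EB7/A39875E8", "8EB7/968FEBE8"),
    ("8EB7/A4000000", "8EB7/968FEBE8"),
)
print(state, backlog)
```

The backlog is only a byte distance between the two positions; how much of it counts as "slow" depends on the workload, so any threshold is a local judgment call.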

4. PostgreSQL logs

Look for messages such as:

invalid record length
PANIC
could not read WAL
requested timeline
waiting for WAL
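
Searching a large log for the messages listed above can be scripted. This is a small Python sketch that treats each entry in the list as a plain, case-sensitive substring. The function name `find_suspicious` is an assumption, and the second sample line is made up purely for illustration.

```python
import re

# Substrings taken from the list above; matching is purely textual.
PATTERNS = [
    "invalid record length",
    "PANIC",
    "could not read WAL",
    "requested timeline",
    "waiting for WAL",
]
_REGEX = re.compile("|".join(re.escape(p) for p in PATTERNS))


def find_suspicious(lines):
    """Yield (line_number, matched_text, line) for each matching log line."""
    for number, line in enumerate(lines, start=1):
        match = _REGEX.search(line)
        if match:
            yield number, match.group(0), line.rstrip("\n")


# Sample input: the second line is hypothetical, not taken from any real log.
sample = [
    "2026-06-04 08:25:18 UTC [197738] LOG: database system is ready to accept read-only connections\n",
    "2026-06-04 09:40:00 UTC [197740] LOG: invalid record length at 0/0: wanted 24, got 0\n",
]
hits = list(find_suspicious(sample))
for number, text, line in hits:
    print(number, text)
```

To scan a real file, pass an open file object instead of the list, for example `find_suspicious(open(path))`, where `path` points at the standby's log.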
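Best regards,
Arif Rahman
mailto:arif.rahman@burnsideproject.ai
https://burnsideproject.ai
https://github.com/orgs/burnside-project

Capture, transform, and learn from PostgreSQL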





From: Jorge Daniel <elgaita@hotmail.com>
To: "pgsql-admin@lists.postgresql.org" <pgsql-admin@lists.postgresql.org>
Date: Thu, 04 Jun 2026 14:13:45 -0700
Subject: Pg14 replication issue , recovery stucks in a random file without advancing while streaming from primary

Good day to everyone

We're asking the PG community for some help, if possible.

We have a primary with 2 secondaries: the primary went down and one of the secondaries was promoted. The orphaned secondary reconnected to the new primary and is replicating OK.
We had to build a new secondary, which we did as we always do, with the basic and dependable:

pg_basebackup -h uspgvento14r.us.local -U replicator -p 5432 -D $PGDATA -Fp -Xs -P -R --checkpoint=fast --create-slot --slot=us_vento_replica_slot_aux

It ran for 5 hours without problems. When it finished:
$ pg_ctl start

Recovery ran until a consistent point was reached; the database opened and started streaming the rest from the primary.
After an hour or so, recovery got stuck on a certain WAL file. There were no more log entries about it (debug 1), and after some hours the streaming connection was disconnected.
The secondary is hung on that WAL file and is not moving on to the rest of the WAL files.

We retried this several times, changing the storage (just in case) and using a new box with the original Ubuntu 22.04 instead of 24.04 (just in case), but the result was the same.
With the 22.04 and 24.04 replicas running in parallel, we saw both freeze on the same file (each time we rebuilt them, the stuck WAL file was a different one, of course).
We're out of ideas about what's happening.
Could you please shed some light here?


Primary: uspgvento14r

Version
server_version | 14.11 (Ubuntu 14.11-0ubuntu0.22.04.1)

pg_stat_replication:

pid              | 4177197
usesysid         | 16474
usename          | replicator
application_name | 14_replica
client_addr      | 192.168.11.33
client_hostname  |
client_port      | 53534
backend_start    | 2026-06-04 01:25:18.100137-07
backend_xmin     |
state            | streaming
sent_lsn         | 8EBA/CDA68930
write_lsn        | 8EBA/CDA68930
flush_lsn        | 8EBA/CDA68930
replay_lsn       | 8EB7/968FEBE8
write_lag        | 00:00:00.001543
flush_lag        | 00:00:00.00225
replay_lag       | 01:00:22.530041
sync_priority    | 0
sync_state       | async
reply_time       | 2026-06-04 03:33:28.037703-07

Log Primary

2026-06-04 01:25:18 PDT [unknown] [unknown] 192.168.12.34 [unknown] [4177197] LOG: connection received: host=192.168.12.34 port=53534
2026-06-04 01:25:18 PDT [2858] LOG: background worker "logical replication worker" (PID 4177182) exited with exit code 1
2026-06-04 01:25:18 PDT [unknown] replicator 192.168.12.34 [unknown] [4177197] LOG: connection authenticated: identity="replicator" method=md5 (/etc/postgresql/14/main/pg_hba.conf:95)
2026-06-04 01:25:18 PDT [unknown] replicator 192.168.12.34 [unknown] [4177197] LOG: replication connection authorized: user=replicator application_name=14_uspgvento14Rb SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256)
--
2026-06-04 04:02:26 PDT [unknown] replicator 192.168.12.34 14_uspgvento14Rb [4177197] LOG: disconnection: session time: 2:37:08.575 user=replicator database= host=192.168.12.34 port=53534




Secondary: 14_uspgvento14Rb 192.168.12.34

Version:
server_version | 14.11 (Ubuntu 14.11-0ubuntu0.24.04.1)

Ubuntu 14.23-1.pgdg24.04+1


.....
2026-06-04 08:25:14 UTC [197740] DEBUG: got WAL segment from archive
2026-06-04 08:25:14 UTC [197740] LOG: restored log file "0000000200008EA40000002D" from archive
2026-06-04 08:25:14 UTC [197740] DEBUG: got WAL segment from archive
2026-06-04 08:25:14 UTC [197740] LOG: restored log file "0000000200008EA40000002E" from archive
2026-06-04 08:25:14 UTC [197740] DEBUG: got WAL segment from archive
2026-06-04 08:25:14 UTC [197740] LOG: restored log file "0000000200008EA40000002F" from archive
2026-06-04 08:25:14 UTC [197740] DEBUG: got WAL segment from archive
2026-06-04 08:25:14 UTC [197740] LOG: restored log file "0000000200008EA400000030" from archive
2026-06-04 08:25:14 UTC [197740] DEBUG: got WAL segment from archive
2026-06-04 08:25:15 UTC [197747] DEBUG: checkpoint sync: number=4 file=base/6176124/2840_fsm time=0.011 ms
2026-06-04 08:25:15 UTC [197747] DEBUG: checkpoint sync: number=5 file=base/6176124/6599312.33 time=1.493 ms
......
2026-06-04 08:25:18 UTC [197740] LOG: restored log file "0000000200008EA400000044" from archive
2026-06-04 08:25:18 UTC [197740] DEBUG: got WAL segment from archive
2026-06-04 08:25:18 UTC [197740] DEBUG: end of backup reached
2026-06-04 08:25:18 UTC [197740] CONTEXT: WAL redo at 8EA4/44680C08 for XLOG/BACKUP_END: 8E91/8B065578
2026-06-04 08:25:18 UTC [197740] LOG: consistent recovery state reached at 8EA4/44680C30
2026-06-04 08:25:18 UTC [197738] LOG: database system is ready to accept read-only connections
cp: cannot stat '/pg_data/pg14_wal_archive/0000000200008EA400000045': No such file or directory
2026-06-04 08:25:18 UTC [207957] LOG: started streaming WAL from primary at 8EA4/45000000 on timeline 2
2026-06-04 08:25:39 UTC [unknown] [unknown] [local] [unknown] [208039] LOG: connection received: host=[local]
2026-06-04 08:25:39 UTC postgres postgres [local] [unknown] [208039] LOG: connection authorized: user=postgres database=postgres application_name=psql
2026-06-04 08:26:15 UTC [197747] LOG: restartpoint starting: time
2026-06-04 08:26:15 UTC [197747] DEBUG: performing replication slot checkpoint
......
2026-06-04 08:30:15 UTC [197747] DEBUG: checkpoint sync: number=4 file=base/6176124/2840_fsm time=0.006 ms
2026-06-04 08:30:15 UTC [197747] DEBUG: checkpoint sync: number=5 file=base/6176124/6599312.33 time=1.572 ms
2026-06-04 08:30:15 UTC [197747] DEBUG: checkpoint sync: number=6 file=base/6176124/6602460.56 time=1.791 ms
.....
2026-06-04 09:35:15 UTC [197747] DEBUG: checkpoint sync: number=782 file=base/6176124/6601067.15 time=0.003 ms
2026-06-04 09:35:15 UTC [197747] DEBUG: checkpoint sync: number=783 file=base/6176124/6374431.10 time=0.006 ms
2026-06-04 09:35:15 UTC [197747] LOG: restartpoint complete: wrote 99500 buffers (19.0%); 0 WAL file(s) added, 0 removed, 25 recycled; write=239.934 s, sync=0.039 s, total=239.989 s; sync files=783, longest=0.001 s, average=0.001 s; distance=250897 kB, estimate=13117629 kB
2026-06-04 09:35:15 UTC [197747] LOG: recovery restart point at 8EB7/7550F228
2026-06-04 09:35:15 UTC [197747] DETAIL: Last completed transaction was at log time 2026-06-04 02:33:05.505911-07.
^@^@^@^@^@^@^@^@ --> Forever <--
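
The primary logs in PDT and the secondary logs in UTC, so the two excerpts are easier to line up once they share a clock. This small Python sketch assumes PDT is a fixed UTC-7 offset, matching the `-07` shown in the "Last completed transaction" line. The helper `pdt_to_utc` is introduced here for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: every PDT timestamp in this thread is exactly UTC-7.
PDT = timezone(timedelta(hours=-7), "PDT")


def pdt_to_utc(stamp: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS' as PDT and return the UTC equivalent."""
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=PDT)
    return local.astimezone(timezone.utc)


connected = pdt_to_utc("2026-06-04 01:25:18")     # primary: connection received
disconnected = pdt_to_utc("2026-06-04 04:02:26")  # primary: disconnection
last_xact = pdt_to_utc("2026-06-04 02:33:05")     # secondary: last completed transaction

print(connected.strftime("%H:%M:%S"))   # 08:25:18
print(last_xact.strftime("%H:%M:%S"))   # 09:33:05
print(disconnected - connected)         # 2:37:08
```

Read this way, the primary accepted the connection in the same second the secondary reports "started streaming WAL" (08:25:18 UTC), and the last replayed transaction (09:33:05 UTC) is about two minutes older than the 09:35:15 UTC restartpoint message. The 2:37:08 difference also agrees with the session time in the disconnection line.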





postgres=# select pg_last_wal_replay_lsn();
 pg_last_wal_replay_lsn
------------------------
 8EB7/968FEBE8


postgres@uspgvento14Rb:/pg_data/data14/pg_wal$ ps -ef |grep wal
postgres 207957 197738 6 08:25 ? 00:04:52 postgres: 14_uspgvento14Rb: walreceiver streaming 8EB7/A39875E8
postgres 211309 163479 0 09:36 pts/12 00:00:00 grep wal
postgres@uspgvento14Rb:/pg_data/data14/pg_wal$ ps -ef |grep reco
postgres 197740 197738 16 08:11 ? 00:14:26 postgres: 14_uspgvento14Rb: startup recovering 0000000200008EB700000096
postgres 211311 163479 0 09:36 pts/12 00:00:00 grep reco
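
The file named by the startup process can be cross-checked against the replay LSN, because a WAL segment name encodes the timeline and the segment that contains a given position. This is a Python sketch of that mapping, assuming timeline 2 and the default 16 MB segment size, which the 16777216-byte files in the `pg_wal` listing suggest. `wal_file_name` is a helper written for this illustration and mirrors what the built-in `pg_walfile_name()` computes on a primary; that function cannot be run during recovery.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size (16 MB)


def wal_file_name(timeline: int, lsn: str, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the name of the WAL segment file that contains the given LSN."""
    hi, lo = (int(part, 16) for part in lsn.split("/"))
    segno = ((hi << 32) | lo) // seg_size
    segs_per_id = 0x100000000 // seg_size  # segments per 4 GB "xlog id"
    return "%08X%08X%08X" % (timeline, segno // segs_per_id, segno % segs_per_id)


replay_file = wal_file_name(2, "8EB7/968FEBE8")   # pg_last_wal_replay_lsn()
receive_file = wal_file_name(2, "8EB7/A39875E8")  # walreceiver position
print(replay_file)
print(receive_file)
```

Under these assumptions the replay LSN falls in 0000000200008EB700000096, the file the startup process reports, while the walreceiver position is already in segment ...A3. That matches the picture of WAL still arriving while replay stays on one record.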



Content of the WAL:

postgres@uspgvento14b:/pg_data/data14/pg_wal$ pg_waldump 0000000200008EB700000096 |grep 8FEBE8
rmgr: MultiXact len (rec/tot): 54/ 54, tx: 2681967401, lsn: 8EB7/968FEBE8, prev 8EB7/968FEBA8, desc: CREATE_ID 2439167 offset 5217620 nmembers 2: 2681967400 (keysh) 2681967401 (keysh)
rmgr: Heap len (rec/tot): 54/ 54, tx: 2681967401, lsn: 8EB7/968FEC20, prev 8EB7/968FEBE8, desc: LOCK off 35: xid 2439167: flags 0x00 IS_MULTI LOCK_ONLY KEYSHR_LOCK , blkref #0: rel 1663/6176124/6474188 blk 29758258

There are still plenty of files to process:
postgres@uspgvento14Rb:/pg_data/data14/pg_wal$ ls -ltr |grep -A5 -B5 0000000200008EB700000096
-rw------- 1 postgres postgres 16777216 Jun 4 09:32 0000000200008EB700000091
-rw------- 1 postgres postgres 16777216 Jun 4 09:32 0000000200008EB700000092
-rw------- 1 postgres postgres 16777216 Jun 4 09:32 0000000200008EB700000093
-rw------- 1 postgres postgres 16777216 Jun 4 09:32 0000000200008EB700000094
-rw------- 1 postgres postgres 16777216 Jun 4 09:32 0000000200008EB700000095
-rw------- 1 postgres postgres 16777216 Jun 4 09:33 0000000200008EB700000096
-rw------- 1 postgres postgres 16777216 Jun 4 09:33 0000000200008EB700000097
-rw------- 1 postgres postgres 16777216 Jun 4 09:33 0000000200008EB700000098
-rw------- 1 postgres postgres 16777216 Jun 4 09:33 0000000200008EB700000099
-rw------- 1 postgres postgres 16777216 Jun 4 09:34 0000000200008EB70000009A
-rw------- 1 postgres postgres 16777216 Jun 4 09:34 0000000200008EB70000009B

postgresql.conf Secondary (primary is the same, but with more memory)

All non-default values:

shared_buffers = 4GB  # min 128kB
temp_buffers = 512MB  # min 800kB
work_mem = 512MB  # min 64kB
maintenance_work_mem = 4GB  # min 1MB
autovacuum_work_mem = 512MB  # min 1MB, or -1 to use maintenance_work_mem
max_stack_depth = 7MB
bgwriter_lru_maxpages = 1000  # max buffers written/round, 0 disables

wal_level = logical  # minimal, replica, or logical
wal_log_hints = on  # also do full page writes of non-critical updates

checkpoint_completion_target = 0.8  # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 600s  # 0 disables
max_wal_size = 500GB
min_wal_size = 50GB

max_wal_senders = 10  # max number of walsender processes
max_replication_slots = 30  # max number of replication slots
wal_keep_size = 150GB  # in megabytes; 0 disables 30720=30GB 1.5 day
max_slot_wal_keep_size = 500GB  # in megabytes; -1 disables 307200=300GB 4 days
wal_sender_timeout = 600s  # in milliseconds; 0 disables

max_standby_archive_delay = 48h  # max delay before canceling queries
max_standby_streaming_delay = 48h  # max delay before canceling queries
hot_standby_feedback = on  # send info from standby to prevent
wal_receiver_timeout = 600s  # time that receiver waits for


------=_Part_13096318_1715792670.1780617085752--