public inbox for [email protected]  
help / color / mirror / Atom feed
Re: PgbackRest PointTIme Recovery : server unable to start back
3+ messages / 2 participants
[nested] [flat]

* Re: PgbackRest PointTIme Recovery : server unable to start back
@ 2024-07-26 05:15 Muhammad Ikram <[email protected]>
  2024-07-26 06:33 ` Re: PgbackRest PointTIme Recovery : server unable to start back KK CHN <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Muhammad Ikram @ 2024-07-26 05:15 UTC (permalink / raw)
  To: KK CHN <[email protected]>; +Cc: pgsql-general

Hi KK CHN

Could you check server logs ?
Your service trace suggests that it started server and then failure happened

ul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
exited, code=exited, status=1/FAILURE



Regards,
Ikram


On Fri, Jul 26, 2024 at 10:04 AM KK CHN <[email protected]> wrote:

> List,
>
> Reference: https://pgbackrest.org/user-guide-rhel.html#pitr
> I am following the   PTR  on RHEL9 EPAS16.
> I am able to do a  backup(Full, diff and incr)  and   restore from a full
> backup  and restart of EPAS16 works fine.
>
> But when I do an incremental backup  after doing the   procedures
> mentioned in the PTR section of the above  reference link and  try
> restoring the EDB database from the INCR backup   and then starting up the
> EPAS16 the server  always results in dead state
>
>  My repo server is another machine.  If  I do  a  full restore  on the DB
> server  ( sudo -u enterprisedb pgbackrest --stanza=Demo_Repo  --delta
> restore) it works  and the server starts without any issue.
> Restoring  from  Incremental backup tty output shows successful but edb
> service start  results in  failure.
>
> Any help is much appreciated.
>
> Krishane.
>
>
>
>
> STEPS followed:
>
> after dropping  the table  pg-primary *⇒* Drop the important table
> ([section]
> stopped the EDB server.
>
> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo    --delta
> --set=20240719-122703F_20240724-094727I --target-timeline=current
> --type=time "--target=2024-07-24 09:44:01.3255+05:30"
> --target-action=promote  restore
> .....................................................
>
> 2024-07-26 09:48:06.343 P00   INFO: restore command end: completed
> successfully (1035ms)
>
>
> But
>
> [root@rservice01 ~]# sudo systemctl start edb-as-16.service
> [root@service01 ~]# sudo systemctl status edb-as-16.service
> × edb-as-16.service - EDB Postgres Advanced Server 16
>      Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
> preset: disabled)
>    *  Active: failed* (Result: exit-code) since Fri 2024-07-26 09:48:50
> IST; 8s ago
>    Duration: 242ms
>     Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
> ${PGDATA} (code=exited, status=0/SUCCESS)
>     Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
> (code=exited, status=1/FAILURE)
>    Main PID: 41908 (code=exited, status=1/FAILURE)
>         CPU: 331ms
>
> Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced
> Server 16...
> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
> LOG:  redirecting log output to logging collector process
> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
> HINT:  Future log output will appear in directory "log".
> Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
> 16.
> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
> exited, code=exited, status=1/FAILURE
> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process
> 41909 (edb-postgres) with signal SIGKILL.
> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with
> result 'exit-code'.
> [root@service01 ~]#
>
> why is it unable to perform a restore and recovery  from an incr  backup ?
>
>
>
>
>
>
>
> On The Repo Server
> [root@service02 ~]#  sudo -u postgres pgbackrest --stanza=Demo_Repo info
> stanza: Demo_Repo
>     status: ok
>     cipher: aes-256-cbc
>
>     db (current)
>         wal archive min/max (16):
> 000000020000000000000021/0000000B0000000000000041
>
>         full backup: 20240719-122703F
>             timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19
> 12:27:06+05:30
>             wal start/stop: 00000002000000000000002A /
> 00000002000000000000002A
>             database size: 61.7MB, database backup size: 61.7MB
>             repo1: backup size: 9.6MB
>
>         incr backup: 20240719-122703F_20240719-123353I
>             timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19
> 12:33:56+05:30
>             wal start/stop: 00000002000000000000002C /
> 00000002000000000000002C
>             database size: 61.7MB, database backup size: 6.4MB
>             repo1: backup size: 6.2KB
>             backup reference list: 20240719-122703F
>
>         diff backup: 20240719-122703F_20240719-123408D
>             timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19
> 12:34:10+05:30
>             wal start/stop: 00000002000000000000002E /
> 00000002000000000000002E
>             database size: 61.7MB, database backup size: 6.4MB
>             repo1: backup size: 6.4KB
>             backup reference list: 20240719-122703F
>
>         incr backup: 20240719-122703F_20240723-110212I
>             timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23
> 11:02:15+05:30
>             wal start/stop: 000000070000000000000038 /
> 000000070000000000000038
>             database size: 48MB, database backup size: 6.4MB
>             repo1: backup size: 9.8KB
>             backup reference list: 20240719-122703F,
> 20240719-122703F_20240719-123408D
>
>         incr backup: 20240719-122703F_20240723-141818I
>             timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23
> 14:18:22+05:30
>             wal start/stop: 00000008000000000000003C /
> 00000008000000000000003C
>             database size: 75.4MB, database backup size: 33.8MB
>             repo1: backup size: 4.7MB
>             backup reference list: 20240719-122703F,
> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I
>
>         incr backup: 20240719-122703F_20240724-094727I
>             timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24
> 09:47:30+05:30
>             wal start/stop: 00000009000000000000003D /
> 00000009000000000000003D
>             database size: 62MB, database backup size: 11.5MB
>             repo1: backup size: 72KB
>             backup reference list: 20240719-122703F,
> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I,
> 20240719-122703F_20240723-141818
>
>
>
>

-- 
Muhammad Ikram


^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: PgbackRest PointTIme Recovery : server unable to start back
  2024-07-26 05:15 Re: PgbackRest PointTIme Recovery : server unable to start back Muhammad Ikram <[email protected]>
@ 2024-07-26 06:33 ` KK CHN <[email protected]>
  2024-07-26 06:57   ` Re: PgbackRest PointTIme Recovery : server unable to start back Muhammad Ikram <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: KK CHN @ 2024-07-26 06:33 UTC (permalink / raw)
  To: Muhammad Ikram <[email protected]>; +Cc: pgsql-general

here the log outputs....

When I ran

$sudo -u enterprisedb pgbackrest --stanza=Demo_Repo  --delta
 --set=20240719-122703F_20240724-094727I --target-timeline=current
--type=time "--target=2024-07-24 09:44:01.3255+05:30"
--target-action=promote  restore



The Edb server  log out put shows
...........................................................................
................................................................
2024-07-23 15:34:12 IST ERROR:  syntax error at or near "selecct" at
character 1
2024-07-23 15:34:12 IST STATEMENT:  selecct current_timestamp
2024-07-23 15:34:26 IST FATAL:  database "enterprisedb" does not exist
2024-07-24 09:38:20 IST FATAL:  database "enterprisedb" does not exist
2024-07-24 09:42:17 IST FATAL:  database "test1" does not exist
2024-07-24 09:43:56 IST FATAL:  database "enterprisedb" does not exist
2024-07-24 09:45:07 IST FATAL:  database "enterprisedb" does not exist
2024-07-24 09:45:16 IST ERROR:  relation "important_table" does not exist
at character 58
2024-07-24 09:45:16 IST STATEMENT:  begin; drop table important_table;
commit; select * from important_table;
2024-07-24 09:46:00 IST LOG:  checkpoint starting: time
2024-07-24 09:46:02 IST LOG:  checkpoint complete: wrote 16 buffers (0.1%);
0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s, sync=0.001 s,
total=1.509 s; sync files=9, longest=0.001 s, average=0.001 s; distance=93
kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
2024-07-24 09:47:27 IST LOG:  checkpoint starting: immediate force wait
2024-07-24 09:47:27 IST LOG:  checkpoint complete: wrote 2 buffers (0.0%);
0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s,
total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s;
distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
2024-07-24 09:47:27.788 P00   INFO: archive-push command begin 2.52.1:
[pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-24 09:47:28.130 P00   INFO: pushed WAL file
'00000009000000000000003C' to the archive
2024-07-24 09:47:28.230 P00   INFO: archive-push command end: completed
successfully (444ms)



When  I ran       $ sudo systemctl start edb-as-16.service

4-07-26 11:32:56 IST LOG:  starting PostgreSQL 16.3 (EnterpriseDB Advanced
Server 16.3.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1
20231218 (Red Hat 11.4.1-3), 64-bit
2024-07-26 11:32:56 IST LOG:  listening on IPv4 address "0.0.0.0", port 5444
2024-07-26 11:32:56 IST LOG:  listening on IPv6 address "::", port 5444
2024-07-26 11:32:56 IST LOG:  listening on Unix socket "/tmp/.s.PGSQL.5444"
2024-07-26 11:32:56 IST LOG:  database system was interrupted; last known
up at 2024-07-24 09:47:27 IST
2024-07-26 11:32:56 IST LOG:  starting point-in-time recovery to 2024-07-24
09:44:01.3255+05:30
2024-07-26 11:32:56 IST LOG:  starting backup recovery with redo LSN
0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
2024-07-26 11:32:56.475 P00   INFO: archive-get command begin 2.52.1:
[00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.709 P00   INFO: found 00000009.history in the repo1:
16-1 archive
2024-07-26 11:32:56.809 P00   INFO: archive-get command end: completed
successfully (335ms)
2024-07-26 11:32:56 IST LOG:  restored log file "00000009.history" from
archive
2024-07-26 11:32:56.816 P00   INFO: archive-get command begin 2.52.1:
[00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.999 P00   INFO: found 00000009000000000000003D in the
repo1: 16-1 archive
2024-07-26 11:32:57.099 P00   INFO: archive-get command end: completed
successfully (285ms)
2024-07-26 11:32:57 IST LOG:  restored log file "00000009000000000000003D"
from archive
2024-07-26 11:32:57 IST LOG:  redo starts at 0/3D000028
2024-07-26 11:32:57.114 P00   INFO: archive-get command begin 2.52.1:
[00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.282 P00   INFO: found 00000009000000000000003E in the
repo1: 16-1 archive
2024-07-26 11:32:57.382 P00   INFO: archive-get command end: completed
successfully (270ms)
2024-07-26 11:32:57 IST LOG:  restored log file "00000009000000000000003E"
from archive
2024-07-26 11:32:57.400 P00   INFO: archive-get command begin 2.52.1:
[00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.521 P00   INFO: unable to find 00000009000000000000003F
in the archive
2024-07-26 11:32:57.621 P00   INFO: archive-get command end: completed
successfully (222ms)
2024-07-26 11:32:57 IST LOG:  completed backup recovery with redo LSN
0/3D000028 and end LSN 0/3D000100
2024-07-26 11:32:57 IST LOG:  consistent recovery state reached at
0/3D000100
2024-07-26 11:32:57 IST LOG:  database system is ready to accept read-only
connections
2024-07-26 11:32:57.632 P00   INFO: archive-get command begin 2.52.1:
[00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.761 P00   INFO: unable to find 00000009000000000000003F
in the archive
2024-07-26 11:32:57.861 P00   INFO: archive-get command end: completed
successfully (231ms)
2024-07-26 11:32:57 IST LOG:  redo done at 0/3E000060 system usage: CPU:
user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
2024-07-26 11:32:57 IST FATAL:  recovery ended before configured recovery
target was reached
2024-07-26 11:32:57 IST LOG:  startup process (PID 43292) exited with exit
code 1


ONLY inference  I can make is

  INFO  unable to find   00000009000000000000003F in the archive    ( This
means  the  EDB server  (10.10.20.6  ) unable to push the archives to the
 Repo server(10.10.20.7 ) ?    Is that the reason for the  recovery and
start backing of edb server fails ?


the pg_hba.conf   entry in the EDB Server machine is as

host    all             all             127.0.0.1/32            ident
host    all             all             10.10.20.7/32          scram-sha-256
#host    all             all              10.10.20.7/32          trust
# IPv6 local connections:
host    all             all             ::1/128                 ident
#host    all             all             10.10.20.7/24           trust

# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     peer
host    replication     all             10.10.20.7/32
scram-sha-256
host    replication     all             127.0.0.1/32            ident
host    replication     all             ::1/128                 ident


Do I have to change anything in pg_hba.conf ?


my EDB Server conf as  this

archive_mode = on
archive_command = 'pgbackrest --stanza=Demo_Repo  archive-push %p'
log_filename = 'postgresql.log'
max_wal_senders = 5
wal_level = replica


Any help ?

Krishane




On Fri, Jul 26, 2024 at 10:45 AM Muhammad Ikram <[email protected]> wrote:

> Hi KK CHN
>
> Could you check server logs ?
> Your service trace suggests that it started server and then failure
> happened
>
> ul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
> 16.
> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
> exited, code=exited, status=1/FAILURE
>
>
>
> Regards,
> Ikram
>
>
> On Fri, Jul 26, 2024 at 10:04 AM KK CHN <[email protected]> wrote:
>
>> List,
>>
>> Reference: https://pgbackrest.org/user-guide-rhel.html#pitr
>> I am following the   PTR  on RHEL9 EPAS16.
>> I am able to do a  backup(Full, diff and incr)  and   restore from a full
>> backup  and restart of EPAS16 works fine.
>>
>> But when I do an incremental backup  after doing the   procedures
>> mentioned in the PTR section of the above  reference link and  try
>> restoring the EDB database from the INCR backup   and then starting up the
>> EPAS16 the server  always results in dead state
>>
>>  My repo server is another machine.  If  I do  a  full restore  on the DB
>> server  ( sudo -u enterprisedb pgbackrest --stanza=Demo_Repo  --delta
>> restore) it works  and the server starts without any issue.
>> Restoring  from  Incremental backup tty output shows successful but edb
>> service start  results in  failure.
>>
>> Any help is much appreciated.
>>
>> Krishane.
>>
>>
>>
>>
>> STEPS followed:
>>
>> after dropping  the table  pg-primary *⇒* Drop the important table
>> ([section]
>> stopped the EDB server.
>>
>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo    --delta
>> --set=20240719-122703F_20240724-094727I --target-timeline=current
>> --type=time "--target=2024-07-24 09:44:01.3255+05:30"
>> --target-action=promote  restore
>> .....................................................
>>
>> 2024-07-26 09:48:06.343 P00   INFO: restore command end: completed
>> successfully (1035ms)
>>
>>
>> But
>>
>> [root@rservice01 ~]# sudo systemctl start edb-as-16.service
>> [root@service01 ~]# sudo systemctl status edb-as-16.service
>> × edb-as-16.service - EDB Postgres Advanced Server 16
>>      Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
>> preset: disabled)
>>    *  Active: failed* (Result: exit-code) since Fri 2024-07-26 09:48:50
>> IST; 8s ago
>>    Duration: 242ms
>>     Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
>> ${PGDATA} (code=exited, status=0/SUCCESS)
>>     Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
>> (code=exited, status=1/FAILURE)
>>    Main PID: 41908 (code=exited, status=1/FAILURE)
>>         CPU: 331ms
>>
>> Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced
>> Server 16...
>> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
>> LOG:  redirecting log output to logging collector process
>> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
>> HINT:  Future log output will appear in directory "log".
>> Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced
>> Server 16.
>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
>> exited, code=exited, status=1/FAILURE
>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process
>> 41909 (edb-postgres) with signal SIGKILL.
>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with
>> result 'exit-code'.
>> [root@service01 ~]#
>>
>> why is it unable to perform a restore and recovery  from an incr  backup ?
>>
>>
>>
>>
>>
>>
>>
>> On The Repo Server
>> [root@service02 ~]#  sudo -u postgres pgbackrest --stanza=Demo_Repo info
>> stanza: Demo_Repo
>>     status: ok
>>     cipher: aes-256-cbc
>>
>>     db (current)
>>         wal archive min/max (16):
>> 000000020000000000000021/0000000B0000000000000041
>>
>>         full backup: 20240719-122703F
>>             timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19
>> 12:27:06+05:30
>>             wal start/stop: 00000002000000000000002A /
>> 00000002000000000000002A
>>             database size: 61.7MB, database backup size: 61.7MB
>>             repo1: backup size: 9.6MB
>>
>>         incr backup: 20240719-122703F_20240719-123353I
>>             timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19
>> 12:33:56+05:30
>>             wal start/stop: 00000002000000000000002C /
>> 00000002000000000000002C
>>             database size: 61.7MB, database backup size: 6.4MB
>>             repo1: backup size: 6.2KB
>>             backup reference list: 20240719-122703F
>>
>>         diff backup: 20240719-122703F_20240719-123408D
>>             timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19
>> 12:34:10+05:30
>>             wal start/stop: 00000002000000000000002E /
>> 00000002000000000000002E
>>             database size: 61.7MB, database backup size: 6.4MB
>>             repo1: backup size: 6.4KB
>>             backup reference list: 20240719-122703F
>>
>>         incr backup: 20240719-122703F_20240723-110212I
>>             timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23
>> 11:02:15+05:30
>>             wal start/stop: 000000070000000000000038 /
>> 000000070000000000000038
>>             database size: 48MB, database backup size: 6.4MB
>>             repo1: backup size: 9.8KB
>>             backup reference list: 20240719-122703F,
>> 20240719-122703F_20240719-123408D
>>
>>         incr backup: 20240719-122703F_20240723-141818I
>>             timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23
>> 14:18:22+05:30
>>             wal start/stop: 00000008000000000000003C /
>> 00000008000000000000003C
>>             database size: 75.4MB, database backup size: 33.8MB
>>             repo1: backup size: 4.7MB
>>             backup reference list: 20240719-122703F,
>> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I
>>
>>         incr backup: 20240719-122703F_20240724-094727I
>>             timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24
>> 09:47:30+05:30
>>             wal start/stop: 00000009000000000000003D /
>> 00000009000000000000003D
>>             database size: 62MB, database backup size: 11.5MB
>>             repo1: backup size: 72KB
>>             backup reference list: 20240719-122703F,
>> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I,
>> 20240719-122703F_20240723-141818
>>
>>
>>
>>
>
> --
> Muhammad Ikram
>
>


^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: PgbackRest PointTIme Recovery : server unable to start back
  2024-07-26 05:15 Re: PgbackRest PointTIme Recovery : server unable to start back Muhammad Ikram <[email protected]>
  2024-07-26 06:33 ` Re: PgbackRest PointTIme Recovery : server unable to start back KK CHN <[email protected]>
@ 2024-07-26 06:57   ` Muhammad Ikram <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: Muhammad Ikram @ 2024-07-26 06:57 UTC (permalink / raw)
  To: KK CHN <[email protected]>; +Cc: pgsql-general

I don't think it has anything to do with the pg_hba.conf file. Archive
seems missing. I will suggest taking a full backup, as you have mentioned
that you are able to take full backup.


Regards,
Ikram


On Fri, Jul 26, 2024 at 11:24 AM KK CHN <[email protected]> wrote:

> here the log outputs....
>
> When I ran
>
> $sudo -u enterprisedb pgbackrest --stanza=Demo_Repo  --delta
>  --set=20240719-122703F_20240724-094727I --target-timeline=current
> --type=time "--target=2024-07-24 09:44:01.3255+05:30"
> --target-action=promote  restore
>
>
>
> The Edb server  log out put shows
> ...........................................................................
> ................................................................
> 2024-07-23 15:34:12 IST ERROR:  syntax error at or near "selecct" at
> character 1
> 2024-07-23 15:34:12 IST STATEMENT:  selecct current_timestamp
> 2024-07-23 15:34:26 IST FATAL:  database "enterprisedb" does not exist
> 2024-07-24 09:38:20 IST FATAL:  database "enterprisedb" does not exist
> 2024-07-24 09:42:17 IST FATAL:  database "test1" does not exist
> 2024-07-24 09:43:56 IST FATAL:  database "enterprisedb" does not exist
> 2024-07-24 09:45:07 IST FATAL:  database "enterprisedb" does not exist
> 2024-07-24 09:45:16 IST ERROR:  relation "important_table" does not exist
> at character 58
> 2024-07-24 09:45:16 IST STATEMENT:  begin; drop table important_table;
> commit; select * from important_table;
> 2024-07-24 09:46:00 IST LOG:  checkpoint starting: time
> 2024-07-24 09:46:02 IST LOG:  checkpoint complete: wrote 16 buffers
> (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s,
> sync=0.001 s, total=1.509 s; sync files=9, longest=0.001 s, average=0.001
> s; distance=93 kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
> 2024-07-24 09:47:27 IST LOG:  checkpoint starting: immediate force wait
> 2024-07-24 09:47:27 IST LOG:  checkpoint complete: wrote 2 buffers (0.0%);
> 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s,
> total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s;
> distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
> 2024-07-24 09:47:27.788 P00   INFO: archive-push command begin 2.52.1:
> [pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-24 09:47:28.130 P00   INFO: pushed WAL file
> '00000009000000000000003C' to the archive
> 2024-07-24 09:47:28.230 P00   INFO: archive-push command end: completed
> successfully (444ms)
>
>
>
> When  I ran       $ sudo systemctl start edb-as-16.service
>
> 4-07-26 11:32:56 IST LOG:  starting PostgreSQL 16.3 (EnterpriseDB Advanced
> Server 16.3.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1
> 20231218 (Red Hat 11.4.1-3), 64-bit
> 2024-07-26 11:32:56 IST LOG:  listening on IPv4 address "0.0.0.0", port
> 5444
> 2024-07-26 11:32:56 IST LOG:  listening on IPv6 address "::", port 5444
> 2024-07-26 11:32:56 IST LOG:  listening on Unix socket "/tmp/.s.PGSQL.5444"
> 2024-07-26 11:32:56 IST LOG:  database system was interrupted; last known
> up at 2024-07-24 09:47:27 IST
> 2024-07-26 11:32:56 IST LOG:  starting point-in-time recovery to
> 2024-07-24 09:44:01.3255+05:30
> 2024-07-26 11:32:56 IST LOG:  starting backup recovery with redo LSN
> 0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
> 2024-07-26 11:32:56.475 P00   INFO: archive-get command begin 2.52.1:
> [00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:56.709 P00   INFO: found 00000009.history in the repo1:
> 16-1 archive
> 2024-07-26 11:32:56.809 P00   INFO: archive-get command end: completed
> successfully (335ms)
> 2024-07-26 11:32:56 IST LOG:  restored log file "00000009.history" from
> archive
> 2024-07-26 11:32:56.816 P00   INFO: archive-get command begin 2.52.1:
> [00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:56.999 P00   INFO: found 00000009000000000000003D in the
> repo1: 16-1 archive
> 2024-07-26 11:32:57.099 P00   INFO: archive-get command end: completed
> successfully (285ms)
> 2024-07-26 11:32:57 IST LOG:  restored log file "00000009000000000000003D"
> from archive
> 2024-07-26 11:32:57 IST LOG:  redo starts at 0/3D000028
> 2024-07-26 11:32:57.114 P00   INFO: archive-get command begin 2.52.1:
> [00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:57.282 P00   INFO: found 00000009000000000000003E in the
> repo1: 16-1 archive
> 2024-07-26 11:32:57.382 P00   INFO: archive-get command end: completed
> successfully (270ms)
> 2024-07-26 11:32:57 IST LOG:  restored log file "00000009000000000000003E"
> from archive
> 2024-07-26 11:32:57.400 P00   INFO: archive-get command begin 2.52.1:
> [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:57.521 P00   INFO: unable to find
> 00000009000000000000003F in the archive
> 2024-07-26 11:32:57.621 P00   INFO: archive-get command end: completed
> successfully (222ms)
> 2024-07-26 11:32:57 IST LOG:  completed backup recovery with redo LSN
> 0/3D000028 and end LSN 0/3D000100
> 2024-07-26 11:32:57 IST LOG:  consistent recovery state reached at
> 0/3D000100
> 2024-07-26 11:32:57 IST LOG:  database system is ready to accept read-only
> connections
> 2024-07-26 11:32:57.632 P00   INFO: archive-get command begin 2.52.1:
> [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:57.761 P00   INFO: unable to find
> 00000009000000000000003F in the archive
> 2024-07-26 11:32:57.861 P00   INFO: archive-get command end: completed
> successfully (231ms)
> 2024-07-26 11:32:57 IST LOG:  redo done at 0/3E000060 system usage: CPU:
> user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
> 2024-07-26 11:32:57 IST FATAL:  recovery ended before configured recovery
> target was reached
> 2024-07-26 11:32:57 IST LOG:  startup process (PID 43292) exited with exit
> code 1
>
>
> ONLY inference  I can make is
>
>   INFO  unable to find   00000009000000000000003F in the archive    ( This
> means  the  EDB server  (10.10.20.6  ) unable to push the archives to the
>  Repo server(10.10.20.7 ) ?    Is that the reason for the  recovery and
> start backing of edb server fails ?
>
>
> the pg_hba.conf   entry in the EDB Server machine is as
>
> host    all             all             127.0.0.1/32            ident
> host    all             all             10.10.20.7/32
>  scram-sha-256
> #host    all             all              10.10.20.7/32          trust
> # IPv6 local connections:
> host    all             all             ::1/128                 ident
> #host    all             all             10.10.20.7/24           trust
>
> # Allow replication connections from localhost, by a user with the
> # replication privilege.
> local   replication     all                                     peer
> host    replication     all             10.10.20.7/32
> scram-sha-256
> host    replication     all             127.0.0.1/32            ident
> host    replication     all             ::1/128                 ident
>
>
> Do I have to change anything in pg_hba.conf ?
>
>
> my EDB Server conf as  this
>
> archive_mode = on
> archive_command = 'pgbackrest --stanza=Demo_Repo  archive-push %p'
> log_filename = 'postgresql.log'
> max_wal_senders = 5
> wal_level = replica
>
>
> Any help ?
>
> Krishane
>
>
>
>
> On Fri, Jul 26, 2024 at 10:45 AM Muhammad Ikram <[email protected]> wrote:
>
>> Hi KK CHN
>>
>> Could you check server logs ?
>> Your service trace suggests that it started server and then failure
>> happened
>>
>> ul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
>> 16.
>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
>> exited, code=exited, status=1/FAILURE
>>
>>
>>
>> Regards,
>> Ikram
>>
>>
>> On Fri, Jul 26, 2024 at 10:04 AM KK CHN <[email protected]> wrote:
>>
>>> List,
>>>
>>> Reference: https://pgbackrest.org/user-guide-rhel.html#pitr
>>> I am following the   PTR  on RHEL9 EPAS16.
>>> I am able to do a  backup(Full, diff and incr)  and   restore from a
>>> full backup  and restart of EPAS16 works fine.
>>>
>>> But when I do an incremental backup  after doing the   procedures
>>> mentioned in the PTR section of the above  reference link and  try
>>> restoring the EDB database from the INCR backup   and then starting up the
>>> EPAS16 the server  always results in dead state
>>>
>>>  My repo server is another machine.  If  I do  a  full restore  on the
>>> DB server  ( sudo -u enterprisedb pgbackrest --stanza=Demo_Repo  --delta
>>> restore) it works  and the server starts without any issue.
>>> Restoring  from  Incremental backup tty output shows successful but edb
>>> service start  results in  failure.
>>>
>>> Any help is much appreciated.
>>>
>>> Krishane.
>>>
>>>
>>>
>>>
>>> STEPS followed:
>>>
>>> after dropping  the table  pg-primary *⇒* Drop the important table
>>> ([section]
>>> stopped the EDB server.
>>>
>>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo    --delta
>>> --set=20240719-122703F_20240724-094727I --target-timeline=current
>>> --type=time "--target=2024-07-24 09:44:01.3255+05:30"
>>> --target-action=promote  restore
>>> .....................................................
>>>
>>> 2024-07-26 09:48:06.343 P00   INFO: restore command end: completed
>>> successfully (1035ms)
>>>
>>>
>>> But
>>>
>>> [root@rservice01 ~]# sudo systemctl start edb-as-16.service
>>> [root@service01 ~]# sudo systemctl status edb-as-16.service
>>> × edb-as-16.service - EDB Postgres Advanced Server 16
>>>      Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
>>> preset: disabled)
>>>    *  Active: failed* (Result: exit-code) since Fri 2024-07-26 09:48:50
>>> IST; 8s ago
>>>    Duration: 242ms
>>>     Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
>>> ${PGDATA} (code=exited, status=0/SUCCESS)
>>>     Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
>>> (code=exited, status=1/FAILURE)
>>>    Main PID: 41908 (code=exited, status=1/FAILURE)
>>>         CPU: 331ms
>>>
>>> Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced
>>> Server 16...
>>> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
>>> LOG:  redirecting log output to logging collector process
>>> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
>>> HINT:  Future log output will appear in directory "log".
>>> Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced
>>> Server 16.
>>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
>>> exited, code=exited, status=1/FAILURE
>>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process
>>> 41909 (edb-postgres) with signal SIGKILL.
>>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with
>>> result 'exit-code'.
>>> [root@service01 ~]#
>>>
>>> why is it unable to perform a restore and recovery  from an incr  backup
>>> ?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On The Repo Server
>>> [root@service02 ~]#  sudo -u postgres pgbackrest --stanza=Demo_Repo info
>>> stanza: Demo_Repo
>>>     status: ok
>>>     cipher: aes-256-cbc
>>>
>>>     db (current)
>>>         wal archive min/max (16):
>>> 000000020000000000000021/0000000B0000000000000041
>>>
>>>         full backup: 20240719-122703F
>>>             timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19
>>> 12:27:06+05:30
>>>             wal start/stop: 00000002000000000000002A /
>>> 00000002000000000000002A
>>>             database size: 61.7MB, database backup size: 61.7MB
>>>             repo1: backup size: 9.6MB
>>>
>>>         incr backup: 20240719-122703F_20240719-123353I
>>>             timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19
>>> 12:33:56+05:30
>>>             wal start/stop: 00000002000000000000002C /
>>> 00000002000000000000002C
>>>             database size: 61.7MB, database backup size: 6.4MB
>>>             repo1: backup size: 6.2KB
>>>             backup reference list: 20240719-122703F
>>>
>>>         diff backup: 20240719-122703F_20240719-123408D
>>>             timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19
>>> 12:34:10+05:30
>>>             wal start/stop: 00000002000000000000002E /
>>> 00000002000000000000002E
>>>             database size: 61.7MB, database backup size: 6.4MB
>>>             repo1: backup size: 6.4KB
>>>             backup reference list: 20240719-122703F
>>>
>>>         incr backup: 20240719-122703F_20240723-110212I
>>>             timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23
>>> 11:02:15+05:30
>>>             wal start/stop: 000000070000000000000038 /
>>> 000000070000000000000038
>>>             database size: 48MB, database backup size: 6.4MB
>>>             repo1: backup size: 9.8KB
>>>             backup reference list: 20240719-122703F,
>>> 20240719-122703F_20240719-123408D
>>>
>>>         incr backup: 20240719-122703F_20240723-141818I
>>>             timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23
>>> 14:18:22+05:30
>>>             wal start/stop: 00000008000000000000003C /
>>> 00000008000000000000003C
>>>             database size: 75.4MB, database backup size: 33.8MB
>>>             repo1: backup size: 4.7MB
>>>             backup reference list: 20240719-122703F,
>>> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I
>>>
>>>         incr backup: 20240719-122703F_20240724-094727I
>>>             timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24
>>> 09:47:30+05:30
>>>             wal start/stop: 00000009000000000000003D /
>>> 00000009000000000000003D
>>>             database size: 62MB, database backup size: 11.5MB
>>>             repo1: backup size: 72KB
>>>             backup reference list: 20240719-122703F,
>>> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I,
>>> 20240719-122703F_20240723-141818
>>>
>>>
>>>
>>>
>>
>> --
>> Muhammad Ikram
>>
>>

-- 
Muhammad Ikram


^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2024-07-26 06:57 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-07-26 05:15 Re: PgbackRest PointTIme Recovery : server unable to start back Muhammad Ikram <[email protected]>
2024-07-26 06:33 ` KK CHN <[email protected]>
2024-07-26 06:57   ` Muhammad Ikram <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox