Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sZSbp-0019nX-Ds for pgsql-general@arkaria.postgresql.org; Thu, 01 Aug 2024 10:00:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sZSbn-006igb-V4 for pgsql-general@arkaria.postgresql.org; Thu, 01 Aug 2024 10:00:31 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sZSbn-006idT-Ai for pgsql-general@lists.postgresql.org; Thu, 01 Aug 2024 10:00:31 +0000 Received: from mail-vk1-xa31.google.com ([2607:f8b0:4864:20::a31]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sZSbk-002Wgf-9J for pgsql-general@postgresql.org; Thu, 01 Aug 2024 10:00:30 +0000 Received: by mail-vk1-xa31.google.com with SMTP id 71dfb90a1353d-4f89c9a1610so99971e0c.3 for ; Thu, 01 Aug 2024 03:00:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1722506427; x=1723111227; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=kfXqszIa+ltvQCpPHA4KAK8idjMMmNC+ETl39yUhF/A=; b=SSv3oAKXoo5px0WSS8rqbaihr2WN9Qf2KLppU46Hl/KnwsseZ5yqr71RXyXXdGPwZk 3+mBLoSBOSWMZDmzTFscxfE3yZ+rOl7a8AZtyOMi1w2JE4lW7ZVAnv5JXmyfANj+ay+Y EaV9wY0oNztzjrZr6ic6IxI9nlr/nQlwEJE++4FdoZxeMsFnGtWbU69aUdpI2rdg09ZK XoeGyGoMhXsEqDtKNLDCHBA1ZqqptTVIJFMAhmrfWW+m31jh4/wg8nOsJNHtahttiigw yn9OxHy9rPsJq7BbYaRGqp7JPfftF+Ye2e/Bk4+IZrj6TzhYG2OwfyLst3uA+NgiyFzw 6h9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722506427; x=1723111227; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=kfXqszIa+ltvQCpPHA4KAK8idjMMmNC+ETl39yUhF/A=; b=MJNqIOLtptac+aoRZmkHxut91YXxHsdagGnMYBYIpYeLlD2ceu6PjNdBbw49bMnzev weeu4ZtrXZQlMxHON7oQS/AwT/VkQzUDWKFVM8Cz/80Q0hxmRGelUijQDmboRXI0B9y/ 4g+no6Nd0Aq/RzCyasWNEub5qVmkU3ZSQpi18SouGveqrft2qexgq0eaDhNgq6Vyr1tH ffKoE1l0i0q+X78Py+pbQHjdNUuIg78HBmg6u3lv3vLbrHmg4AMRV/YCiPKhGKfV7r54 TY7TNjaTYJGMTS7Uhmat4UzNFsP1TeKReqc+UMvpd4X4nTVZIIkU0PHKoc/GlQb4B9HG xEiw== X-Gm-Message-State: AOJu0YwG3qpYcIc3sCnrp7DqOKlaZ9RiW253Ghvuv/DhyGfO2TqZDWOf y3Q9OK6zLEQBvA0w6GkJHwjxJFsNC8RYKJJkebfYr1B/SRjjSkj/cQKHKGLmAqvOMWdRXljHiEc 6pGPgbTRZ2ePWMcNZOV+hU8Np5Be4hw== X-Google-Smtp-Source: AGHT+IEUPisPVUrwIACIsOVIvB+6Nvr8QfbDyrq/E/kmH705/52V5Z3qrTATCkOeZFUSe1o2PagBlsFLH2giII8x8+E= X-Received: by 2002:a05:6122:1787:b0:4f5:446c:f749 with SMTP id 71dfb90a1353d-4f8961dfbb5mr2837342e0c.6.1722506427389; Thu, 01 Aug 2024 03:00:27 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Kashif Zeeshan Date: Thu, 1 Aug 2024 15:00:16 +0500 Message-ID: Subject: Re: PgBackRest PTR recovery: After table drop to get dropped state To: KK CHN Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000ba5f01061e9c47c8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000ba5f01061e9c47c8 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi On Thu, Aug 1, 2024 at 2:54=E2=80=AFPM KK CHN wrote: > List, > > *Not working (start EPAS server always fails):* > > 1. Testing PTR using PgBackRest(2.52.1) on RHEL9 EPAS-16, and RHEL9 ( > Repo Server) > > When I do a PTR > > 1. After doing a table drop and then > 2. Noting down the time stamp and then > 3. Taking an incremental backup in hope that If I do a restore from this > incr Backup, that won't contain the dropped table. > 4. Correct me if I am conceptually wrong here. > 5. I am *never *successful in restoring the EPAS server in this scenario= . > > > *I know the following will work for me, w*hy not the above one if I > really want that state of cluster also ? > > *This is Working. * > 1. Create table > 2. Noting down the timestamp > 3. Taking incremental backup on RepoServer. > 4. drop the created table . > 5. Then stop the EPAS server and do a PTR, by the --set=3Dstep 3 incr > backup and target=3D step 2 time stamp .. It finished the pgaback restor= e > and promote command > 6. I am able to start back the EPAS server and see the dropped table > recovered there. > > But If I want a PTR as in the first section it fails.. Why ? > > Thank you, > Krishane > > > > > > *What I have done and results obtained: * > > Created a table important_table4 in my EPAS and note down the time after > creation of this table it is ( t1 : "01-AUG-24 14:08:32.447796+05:30" ) > > Then I performed an Incremental backup (incr backup: > 20240729-160137F_20240801-141148I ) > timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-08-01 14:11:52+05:= 30 > > > Now I dropped the table table4 from the EPAS and noted down the time > > > I want to restore the table4,, so I stopped EPAS and executed > > $ sudo -u enterprisedb pgbackrest --stanza=3DDemo_Repo --delta > --set=3D20240729-160137F_20240801-141148I --target-timeline=3Dcurrent > --type=3Dtime --target=3D"01-AUG-24 14:08:32.447796+05:30" > --target-action=3Dpromote restore > > IT WORKS AS EXPECTED .. after restarting the EPAS I am able to get the > important_table4 back. > > root@service01 ~]# sudo -u enterprisedb psql edb > psql (16.3.0) > Type "help" for help. > > edb=3D# \dt > List of relations > Schema | Name | Type | Owner > --------+------------------+-------+-------------- > public | important_table | table | enterprisedb > public | important_table2 | table | enterprisedb > public | important_table3 | table | enterprisedb > public | important_table4 | table | enterprisedb > (4 rows) > > SO all works fine !!!! . > > > *But Now the PROBLEM Statement. * > > *1. I am dropping the table table 4 again * > edb=3D# \q > [root@service01 ~]# sudo -u enterprisedb psql -c "begin; drop table > important_table4; commit;" edb > BEGIN > DROP TABLE > COMMIT > *2 . [root@service01 ~]#* sudo -u enterprisedb psql -Atc "select > current_timestamp" edb 01-AUG-24 14:23:22.085076 +05:30 > Noting the time as : (01-AUG-24 14:23:22.085076 +05:30 ) > > 3. Now I am performing an incremental backup after step 2 on REPO SErve= r > ( Hoping that this latest INCR Backup is without dropped important_table4= , > so that a recovery of the cluster shouldn't show the table4 again. ) > > incr backup details. : 20240729-160137F_20240801-142433I > timestamp start/stop*: 2024-08-01 14:24:33+05:30 / 2024-08-01 > 14:24:36+05:30* > > 4. Now I want to test the database recovery after dropping the table4 in > step1 to verify that my EPAS restores from the backup in step 3 and time > stamp (01-AUG-24 14:23:22.085076 +05:30, so that the restored EPAS > cluster doesn't contain the important_table4. > > 5. $ sudo -u enterprisedb pgbackrest --stanza=3DDemo_Repo --delta > --set=3D20240729-160137F_20240801-142433I --target-timeline=3Dcurrent > --type=3Dtime --target=3D"01-AUG-24 14:23:22.085076+05:30" > --target-action=3Dpromote restore > ------------ > ------------- > INFO: restore command end: completed successfully (1035ms) > > *ISSUE: I am unable to get the EPAS Server* in running state after > step 5 > > *What am I doing wrong ? OR am I conceptually wrong ?* > > > > > OUTPUT on executing step 5. > > [root@service01 ~]# sudo -u enterprisedb pgbackrest --stanza=3DDemo_Repo > --delta --set=3D20240729-160137F_20240801-142433I --target-timeline=3Dcu= rrent > --type=3Dtime --target=3D"01-AUG-24 14:23:22.085076+05:30" > --target-action=3Dpromote restore > > 2024-08-01 14:30:03.535 P00 INFO: restore command begin 2.52.1: --delta > --exec-id=3D82738-b5fe7415 --log-level-console=3Dinfo --log-level-file=3D= debug > --pg1-path=3D/var/lib/edb/as16/data --pg-version-force=3D16 > --repo1-host=3D10.10.20.7 --repo1-host-user=3Dpostgres > --set=3D20240729-160137F_20240801-142433I --stanza=3DDemo_Repo > --target=3D"01-AUG-24 14:23:22.085076+05:30" --target-action=3Dpromote > --target-timeline=3Dcurrent --type=3Dtime > 2024-08-01 14:30:03.880 P00 INFO: repo1: restore backup set > 20240729-160137F_20240801-142433I, recovery will start at 2024-08-01 > 14:24:33 > 2024-08-01 14:30:03.881 P00 INFO: remove invalid files/links/paths from > '/var/lib/edb/as16/data' > 2024-08-01 14:30:04.567 P00 INFO: write updated > /var/lib/edb/as16/data/postgresql.auto.conf > 2024-08-01 14:30:04.569 P00 INFO: restore global/pg_control (performed > last to ensure aborted restores cannot be started) > 2024-08-01 14:30:04.569 P00 INFO: restore size =3D 75.9MB, file total = =3D > 2171 > 2024-08-01 14:30:04.569 P00 INFO: restore command end: completed > successfully (1035ms) > *[root@service01 ~]# systemctl start edb-as-16.service* > > *Now If I check the server status : Its dead * > > [root@service01 ~]# systemctl status edb-as-16.service > =C3=97 edb-as-16.service - EDB Postgres Advanced Server 16 > Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled; > preset: disabled) > *Active: failed* (Result: exit-code) since Thu 2024-08-01 14:30:58 > IST; 4s ago > Duration: 228ms > Process: 82752 ExecStartPre=3D/usr/edb/as16/bin/edb-as-16-check-db-di= r > ${PGDATA} (code=3Dexited, status=3D0/SUCCESS) > Process: 82757 ExecStart=3D/usr/edb/as16/bin/edb-postgres -D ${PGDATA= } > (code=3Dexited, status=3D1/FAILURE) > Main PID: 82757 (code=3Dexited, status=3D1/FAILURE) > CPU: 325ms > > Aug 01 14:30:56 service01 systemd[1]: Starting EDB Postgres Advanced > Server 16... > Aug 01 14:30:56 service01 edb-postgres[82757]: 2024-08-01 14:30:56 IST > LOG: redirecting log output to logging collector process > Aug 01 14:30:56 rservice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST > HINT: Future log output will appear in directory "log". > Aug 01 14:30:58 service01 systemd[1]: Started EDB Postgres Advanced Serve= r > 16. > Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Main process > exited, code=3Dexited, status=3D1/FAILURE > Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killing process > 82758 (edb-postgres) with signal SIGKILL. > A*ug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Failed with > result 'exit-code'.* > [root@service01 ~]# > > Any hints/guidance most welcome. > > Can you please share the DB Server log as it contains the exact error which is causing the server not to start. Thanks --000000000000ba5f01061e9c47c8 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi

On Thu, Aug 1, 2024 at 2:54=E2=80=AFPM = KK CHN <kkchn.in@gmail.com>= wrote:
List,

Not working (start EPAS ser= ver always fails):=C2=A0

1. Testing PTR using= =C2=A0 PgBackRest(2.52.1)=C2=A0 on RHEL9=C2=A0 EPAS-16, and RHEL9 ( Repo=C2= =A0 =C2=A0 =C2=A0 =C2=A0Server)=C2=A0

=C2=A0 When = I do a PTR

1.=C2=A0 After doing a table drop and t= hen
2. Noting down the time stamp and then=C2=A0
3. Tak= ing an incremental=C2=A0backup in hope that If I do a restore=C2=A0from thi= s incr Backup, that won't=C2=A0 contain the=C2=A0 dropped table.=C2=A0<= /div>
4.=C2=A0Correct me=C2=A0 if I=C2=A0am=C2=A0 conceptually wrong he= re.=C2=A0=C2=A0
5.=C2=A0 I am never successful in restorin= g the EPAS server in this scenario.


I know the following=C2=A0will work for me, why not the abov= e one if I really want=C2=A0that state of cluster also=C2=A0 ?=C2=A0

This is Working.=C2=A0
=C2=A01. Cre= ate table=C2=A0
2. Noting down the timestamp
3.=C2=A0 T= aking incremental backup on RepoServer.
4. drop the created table= .
5. Then stop the EPAS server and do a=C2=A0 PTR, by the=C2=A0 = --set=3Dstep 3 incr backup=C2=A0 and target=3D step 2 time stamp .. It fini= shed the pgaback=C2=A0restore and promote=C2=A0command
6. I am ab= le to start back the=C2=A0 EPAS server and see the dropped table recovered = there.

But If I want a PTR as in the first section= it fails.. Why ?=C2=A0

Thank you,
Krish= ane




<= br>
What I have done and results obtained:=C2=A0

Created a table important_table4 in my EPAS and note down t= he time after creation of this table it is=C2=A0 ( t1 :=C2=A0 "01-AUG-= 24 14:08:32.447796+05:30" )

Then I performed = an Incremental backup=C2=A0 =C2=A0(incr backup: 20240729-160137F_20240801-1= 41148I )
timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-0= 8-01 14:11:52+05:30


Now I dropp= ed the table table4 from the EPAS and noted down the time=C2=A0
<= br>

I want to=C2=A0 restore the table4,, so I stop= ped EPAS and executed=C2=A0

$ sudo -u enterprisedb= pgbackrest --stanza=3DDemo_Repo --delta --set=3D20240729-160137F_20240801-= 141148I =C2=A0--target-timeline=3Dcurrent --type=3Dtime =C2=A0--target=3D&q= uot;01-AUG-24 14:08:32.447796+05:30" --target-action=3Dpromote restore=

IT WORKS AS EXPECTED .. after restarting the = EPAS I am able to get the important_table4 back.=C2=A0

=
root@service01 ~]# sudo -u enterprisedb psql edb
psql (16.3.0)
T= ype "help" for help.

edb=3D# \dt
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 List of relations
=C2=A0Schema | =C2= =A0 =C2=A0 =C2=A0 Name =C2=A0 =C2=A0 =C2=A0 | Type =C2=A0| =C2=A0 =C2=A0Own= er
--------+------------------+-------+--------------
=C2=A0public | = important_table =C2=A0| table | enterprisedb
=C2=A0public | important_ta= ble2 | table | enterprisedb
=C2=A0public | important_table3 | table | en= terprisedb
=C2=A0public | important_table4 | table | enterprisedb
(4 = rows)

SO all works fine !!!! .=C2=A0

=

But Now the PROBLEM Statement.=C2=A0

1. I am dropping the table table 4=C2=A0again=C2=A0=
edb=3D# \q
[root@service01 ~]# sudo -u enterprisedb psql = -c "begin; drop table important_table4; commit;" edb
BEGIN
= DROP TABLE
COMMIT
2 .=C2=A0 [root@service01 ~]# sudo -u enterp= risedb psql -Atc "select current_timestamp" edb=C2=A0 01-AUG-24 1= 4:23:22.085076 +05:30
Noting the time as :=C2=A0 =C2=A0(01-AU= G-24 14:23:22.085076 +05:30 )

3. Now=C2=A0 I a= m performing an incremental backup after step 2=C2=A0 on REPO SErver ( Hopi= ng that this latest INCR Backup is without dropped important_table4, so tha= t a recovery of the cluster=C2=A0 shouldn't show the table4 again. )=C2= =A0

=C2=A0 =C2=A0 =C2=A0 =C2=A0 incr backup detail= s. : 20240729-160137F_20240801-142433I
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 timestamp start/stop: 2024-08-01 14:24:33+05:30 / 2024-08-01 = 14:24:36+05:30

4. Now I want to test the d= atabase recovery=C2=A0 after dropping the table4 in step1 to verify that my= EPAS restores from the backup in step 3 and time stamp (01-AUG-24 14:23:22= .085076 +05:30,=C2=A0 =C2=A0so that=C2=A0 the restored EPAS cluster doesn&#= 39;t contain the important_table4.

5.=C2=A0 $ sudo= -u enterprisedb pgbackrest --stanza=3DDemo_Repo --delta=C2=A0 =C2=A0--set= =3D20240729-160137F_20240801-142433I =C2=A0--target-timeline=3Dcurrent --ty= pe=3Dtime =C2=A0--target=3D"01-AUG-24 14:23:22.085076+05:30" --ta= rget-action=3Dpromote restore
=C2=A0------------
------= -------
INFO: restore command end: completed successfully (1035ms= )

ISSUE:=C2=A0 =C2=A0 I am unable to get th= e EPAS Server in running state after step 5=C2=A0

<= div>=C2=A0What am I doing=C2=A0wrong ?=C2=A0 OR am I conceptually wrong = ?




OUTPUT on executing step 5.=C2=A0

[root@service01= ~]# sudo -u enterprisedb pgbackrest --stanza=3DDemo_Repo --delta --set=3D2= 0240729-160137F_20240801-142433I =C2=A0--target-timeline=3Dcurrent --type= =3Dtime =C2=A0--target=3D"01-AUG-24 14:23:22.085076+05:30" --targ= et-action=3Dpromote restore

2024-08-01 14:30:03.535 P00 =C2= =A0 INFO: restore command begin 2.52.1: --delta --exec-id=3D82738-b5fe7415 = --log-level-console=3Dinfo --log-level-file=3Ddebug --pg1-path=3D/var/lib/e= db/as16/data --pg-version-force=3D16 --repo1-host=3D10.10.20.7 --repo1-host= -user=3Dpostgres --set=3D20240729-160137F_20240801-142433I --stanza=3DDemo_= Repo --target=3D"01-AUG-24 14:23:22.085076+05:30" --target-action= =3Dpromote --target-timeline=3Dcurrent --type=3Dtime
2024-08-01 14:30:03= .880 P00 =C2=A0 INFO: repo1: restore backup set 20240729-160137F_20240801-1= 42433I, recovery will start at 2024-08-01 14:24:33
2024-08-01 14:30:03.8= 81 P00 =C2=A0 INFO: remove invalid files/links/paths from '/var/lib/edb= /as16/data'
2024-08-01 14:30:04.567 P00 =C2=A0 INFO: write updated /= var/lib/edb/as16/data/postgresql.auto.conf
2024-08-01 14:30:04.569 P00 = =C2=A0 INFO: restore global/pg_control (performed last to ensure aborted re= stores cannot be started)
2024-08-01 14:30:04.569 P00 =C2=A0 INFO: resto= re size =3D 75.9MB, file total =3D 2171
2024-08-01 14:30:04.569 P00 =C2= =A0 INFO: restore command end: completed successfully (1035ms)
[root@= service01 ~]# systemctl =C2=A0start edb-as-16.service
Now=C2=A0 If I check the server=C2=A0 status=C2=A0 :=C2=A0 = =C2=A0Its dead=C2=A0

[root@service01 ~]# s= ystemctl =C2=A0status edb-as-16.service
=C3=97 edb-as-16.service - EDB P= ostgres Advanced Server 16
=C2=A0 =C2=A0 =C2=A0Loaded: loaded (/etc/syst= emd/system/edb-as-16.service; disabled; preset: disabled)
=C2=A0 =C2=A0 = =C2=A0Active: failed (Result: exit-code) since Thu 2024-08-01 14:30:= 58 IST; 4s ago
=C2=A0 =C2=A0Duration: 228ms
=C2=A0 =C2=A0 Process: 82= 752 ExecStartPre=3D/usr/edb/as16/bin/edb-as-16-check-db-dir ${PGDATA} (code= =3Dexited, status=3D0/SUCCESS)
=C2=A0 =C2=A0 Process: 82757 ExecStart=3D= /usr/edb/as16/bin/edb-postgres -D ${PGDATA} (code=3Dexited, status=3D1/FAIL= URE)
=C2=A0 =C2=A0Main PID: 82757 (code=3Dexited, status=3D1/FAILURE)=C2=A0 =C2=A0 =C2=A0 =C2=A0 CPU: 325ms

Aug 01 14:30:56 service01 sy= stemd[1]: Starting EDB Postgres Advanced Server 16...
Aug 01 14:30:56 se= rvice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST LOG: =C2=A0redirecting= log output to logging collector process
Aug 01 14:30:56 rservice01 edb-= postgres[82757]: 2024-08-01 14:30:56 IST HINT: =C2=A0Future log output will= appear in directory "log".
Aug 01 14:30:58 service01 systemd[= 1]: Started EDB Postgres Advanced Server 16.
Aug 01 14:30:58 service01 s= ystemd[1]: edb-as-16.service: Main process exited, code=3Dexited, status=3D= 1/FAILURE
Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killi= ng process 82758 (edb-postgres) with signal SIGKILL.
Aug 01 14:30:58 = service01 systemd[1]: edb-as-16.service: Failed with result 'exit-code&= #39;.
[root@service01 ~]#

Any hints/gui= dance most welcome.=C2=A0

Can y= ou please share the DB Server log as it contains the exact error which is c= ausing the server not to start.

Thanks=C2=A0
=
--000000000000ba5f01061e9c47c8--