Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tdRFu-007DQH-1I for pgsql-admin@arkaria.postgresql.org; Thu, 30 Jan 2025 09:54:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tdRFq-00B2KF-Gu for pgsql-admin@arkaria.postgresql.org; Thu, 30 Jan 2025 09:54:34 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tdRFq-00B2K7-0w for pgsql-admin@lists.postgresql.org; Thu, 30 Jan 2025 09:54:34 +0000 Received: from mail-francecentralazon11021119.outbound.protection.outlook.com ([40.107.160.119] helo=PAUP264CU001.outbound.protection.outlook.com) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1tdRFm-002K9z-1K for pgsql-admin@lists.postgresql.org; Thu, 30 Jan 2025 09:54:33 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=FSoTVCpfoY8udHES+75W/9hGwF4kSjTDOZ6WJfY8ZHgj8B31eNyu8ATaYBsC2++lYZTseh0BTIYIODkJMdWGeRBXLNcVsPCktScbesg552HbIXmjS6r1sBsVDpSK/VxTBDqkwMzL4ng83sJS6sadUa+QVxvlr6eqGIw56qYlZrbo1yI6nj7TPuhS4yGIqQg0Ah27MxRZt4Xh90JtmACbiltrA8P8nZvwW805xt+LOlLB45/wmMS8KPisgGYt9kU+p7oiWHfLgfFRNVkRnq+eKGM93RiOCitTydJ3EyyjeOcR4Ejld/vStg2RZ+0YK7I/6XZWQh4sz8UrxcdAUsUGzw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5dZ4GKSV20Us5JQWpI5M5fDI8agGIjc/Eg379OhMLzE=; b=EI3IXU4ya0ZPpLievgEz8v29EtTpR/gaNyHq81MO56Z/XFxVBru7T7OtbAHMFSKBxwvaDxqEW56XxehRSGuwR1sMgs8zxQmk08PlemRkGaXah14FQFdS8JKaSmxv4am9Vw9olsnlOyYq/4IWjp/ITBkEYRgTqLV1S6rJq0fCvowUdP3YS1xy1QXZIZ1JQjoVYl28iH3VTno1J7iAjuNnJmvyWy1vXIxH6eWTNx3nRzdaJoBDQbkP0jigFCRFpXO7UbSJ6vErYykkgi5beyFqzY1uAQpSbONtLEKqwL9vEdMQjtzEDcR8t2Y3dJ5Yybs4CrmtF8bswdSaX/7Ev2f0TA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=xelians.fr; dmarc=pass action=none header.from=xelians.fr; dkim=pass header.d=xelians.fr; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=xelians.fr; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=5dZ4GKSV20Us5JQWpI5M5fDI8agGIjc/Eg379OhMLzE=; b=z7jSiK5KHL6wmbWHqOywetDAE3jVC0zg5nTIZ/e5qMwKReE4CyImGQkZN8kya6hR12hgN3AA6OLmWEZUBzpf1TlGU1qTkrqhvs+aeOvSr68NypRDMMRiwTJA7KIK5JxfKUVjpP1t2147BuXDWfpONg4lR+e0WM2JkxIgbSeyGk7yX/Z7S2ldB5WQkmtNw/MSH/l0vAO59I2S2Js3tMrAHXefyHoW1R77O5JGaPR4XQYZWjWz4q8q/oJ6hH57L+hiAy8lc9i8pAZbnzDYJYBN6VH8nUVfCxY0F+BkQRaCIRGym/sii9UvJ3WqAdMXdPGNppWQ+tl6cHEQbjm3uJvZzQ== Received: from MR2P264MB0147.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:4::20) by MRZP264MB1733.FRAP264.PROD.OUTLOOK.COM (2603:10a6:501:c::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8398.20; Thu, 30 Jan 2025 09:54:26 +0000 Received: from MR2P264MB0147.FRAP264.PROD.OUTLOOK.COM ([fe80::f31c:a92e:dfaa:af4f]) by MR2P264MB0147.FRAP264.PROD.OUTLOOK.COM ([fe80::f31c:a92e:dfaa:af4f%4]) with mapi id 15.20.8398.017; Thu, 30 Jan 2025 09:54:26 +0000 From: Samuel VISCAPI To: Alvaro Herrera CC: "pgsql-admin@lists.postgresql.org" Subject: RE: 2nd PostgreSQL server in WAL shipping cluster fails to start Thread-Topic: 2nd PostgreSQL server in WAL shipping cluster fails to start Thread-Index: AdtifmqCI/mjxc4zTd6+dTy4ZYsnWQACMquABBvf7qA= Date: Thu, 30 Jan 2025 09:54:26 +0000 Message-ID: References: <202501091111.ds3gg5ae7kyc@alvherre.pgsql> In-Reply-To: <202501091111.ds3gg5ae7kyc@alvherre.pgsql> Accept-Language: fr-FR, en-US Content-Language: fr-FR X-Mentions: alvherre@alvh.no-ip.org X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=xelians.fr; x-ms-publictraffictype: Email x-ms-traffictypediagnostic: MR2P264MB0147:EE_|MRZP264MB1733:EE_ x-ms-office365-filtering-correlation-id: 567b70d4-a8ca-49eb-3a97-08dd411414d8 x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0;ARA:13230040|10070799003|366016|1800799024|376014|38070700018; x-microsoft-antispam-message-info: =?iso-8859-1?Q?8CTibLoMDhVI26qLezfb8hi4od6PqB4Y3JMDioGja69kDz2Dv1tmNOYk1U?= =?iso-8859-1?Q?RSkUcCq1tTmnI09tsB44207d0zveG0OqXFPMj/N9PDQZ78naSvUI6/zm5m?= =?iso-8859-1?Q?EegQZlZMKQcFOqi4AJJ6QkYgVt+MkQ3NXDp77mSkoAkIxSARG6UEs0pudc?= =?iso-8859-1?Q?jVOhB/XF89V2Uo3se4k5PTq3gD0FnrGhFOGk/XUDHcRF7oJklTAiPb4w11?= =?iso-8859-1?Q?wGgjyAjYJkPiXz4dWrba2iAE7+EdMapT2elc5J72syyCRG6jPoRXugG9hY?= =?iso-8859-1?Q?S+M1JZAwoS48B7YdiBJYTiZ+6Q9JGbGPWM7b87dguit6UIVuijgDeJUc2m?= =?iso-8859-1?Q?698UXjOu6Rsh02HtCbX73PBrFoesva7cY7ASk6/N9H38OmX3R8kfziWG/z?= =?iso-8859-1?Q?Zm8n4XrlNpIM9XZrihVHhg6rnoMBcqYUnrA1DHZ4cHF1q9a3qcWicmUfb8?= =?iso-8859-1?Q?TLsHRQCksEG+WOxxXdODH0vhpT+7R+tVenFPDltrUhf8+HR1tIe0oVFlEc?= =?iso-8859-1?Q?2us8RkDXJL0H0mnjIi2q1jJDkqHoQws9/RhPG4s2E9KwEERfoU/zuk+WbV?= =?iso-8859-1?Q?dN6VmAYmBISIrTN4JCXFV+YBZCCn+cPpIkZFN5Pdqe1+D0LlqpbdW8Gvsr?= =?iso-8859-1?Q?R1+E8DTOpGVzS9sRJLOs9+/jzSrwy6vIzXycaEcqSP8ZQckX18wjGu1l4b?= =?iso-8859-1?Q?oHNe+36BB4YFZNWyoV0GVxnUisXh8UBu5TzDzEfTIA1Urb5ED22/yEbeS/?= =?iso-8859-1?Q?e9/L7FVBSy9iE/9E749XbJvfYQ5GOtCxdNRKKtOAM8xED2TqwqQV9j31km?= =?iso-8859-1?Q?903aTzBAsI7TiV2Qitato6c5IEXZMD499oi1v5oDmueDk+rhjoZdKGWHZO?= =?iso-8859-1?Q?KJD9svhIkh/r/TSxr9UFNiefbfEkg5yOFQM+LFxNI2rBS97gUKGuY01W+L?= =?iso-8859-1?Q?OppvZufbkDAdkzVu225TDET0mHp9NjHQ4WPrcOb6OCsAaL/9e7R1h3HDCF?= =?iso-8859-1?Q?RickUvlzI43WOB3/VW9V8aw719f7myh6oNsd/ZK4i/Ceh+NePYcqNx2Zt/?= =?iso-8859-1?Q?8NJXala2NxKh1SHf1f1nWslKwSlRZEMWir75K43I0ofCkxlKzasREzZ8u2?= =?iso-8859-1?Q?v4pWNEovf8EYaacgJGz92IF7IlQ796ksLqjv3YS1PXhDpkeYKdawkuJh0V?= =?iso-8859-1?Q?hUFOFCNHlt14zHdnL2r03TPBR6Y/BzxSCgtZc8WxqBmKMYoP0zqF22VW8d?= =?iso-8859-1?Q?+iJWgOS1FuFOV3uji9yWi1ZUiDbabXqup30F+RCY00SSez2elJfyQFjwJD?= =?iso-8859-1?Q?yPtzPEBQA/6mwyOiZmbb45hUSGLLb8nRA3xFK1cSrMVq5ow6xEpRWMJK1c?= =?iso-8859-1?Q?29LF1839k/KpkrPLK/xsGbE7eqZ2FzKGPrYZY5dTIUVHwwVHSBymFAMGm6?= =?iso-8859-1?Q?tWCxHHgciSXdg2Ys?= x-forefront-antispam-report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:MR2P264MB0147.FRAP264.PROD.OUTLOOK.COM;PTR:;CAT:NONE;SFS:(13230040)(10070799003)(366016)(1800799024)(376014)(38070700018);DIR:OUT;SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?iso-8859-1?Q?/lfEPusFw1H092JiSkhY/V/h5yI99MMJ6hO3lY7wFElNPfD5LFdyroWqiB?= =?iso-8859-1?Q?dbvpTxaGLggUwK4c61sQyFkTeYCUpVEOQM34LjDu2FKx6ZUrHIclzQaKpd?= =?iso-8859-1?Q?B/SIhfq8+WzrZH6mu2SpgPlkTkmaNq47H+b1dScDLupVfEerZ0cMhxXCUH?= =?iso-8859-1?Q?sg0Nr0kzTVEPWLyW6+Mpwp8UpzGTclWH9nIwqIdpALr/aRwl5kvEElMM0G?= =?iso-8859-1?Q?pG/ADaDaoGxfF/jDOUSNs1kf+Tm5PGWOOdLOrT3q/73S6wGMTbknE/M6RA?= =?iso-8859-1?Q?DQPTcW+9mzGWGwVMxxuvgYe2IgDqYBGgJT7kJlrsJREfFvRMsPoK2ilfrO?= =?iso-8859-1?Q?Tj7AB1RK5OUK7aLOiuM0IqVOfBghGOOfnIMpirIPDYmaOwPkc45POmUVrC?= =?iso-8859-1?Q?Jea4TxZrUG/cBABopw/nXkJtPab2vxAOiVomv2reXVLDiqJank+FHzorbx?= =?iso-8859-1?Q?PNlRN3WSZKL+LFGfA703Ng6/MGxbYwO8jwOs3DGtxDCjdR3kx+y1xsELdz?= =?iso-8859-1?Q?g7rRZCdCTkkIbP3bi2iBQtW03FHNs7dJ2ZmvIpTNhq8OOR/9sKpEcgkD1I?= =?iso-8859-1?Q?76cEW7/4DLhS5XM1neBAHaNH86zOb9K+YdBOoj+Sf2U9nyAUl3oGcpaB+f?= =?iso-8859-1?Q?x5gmfDp9QiZr8uGDrtvNJIuifIu8uTZMnQhttoo4i4H4H0Egl59ARSpn/h?= =?iso-8859-1?Q?T40gcynNXqmiYctm9Z4zW4uV2tdV122gFWgjr8Z/XcJEyzZHTvXxbztzeB?= =?iso-8859-1?Q?yGC3Rv6wRr4QuV2Y+lsvRFosQ1SFCcLzIw4woyjroFhpVwgzkuZ+kJlDWa?= =?iso-8859-1?Q?/g+qC5eOhpmgN/6nmJuQ2xLod9qpU4QMj11gh8z7YUpdD2jcLjfBnC3vK6?= =?iso-8859-1?Q?R+dhwfW1RrlcCPPSq0w7chTYPdvW6/n4PZkTWx0pf2Cv5H3ubp6oJ1A1cU?= =?iso-8859-1?Q?bwCgyMfNcVu3v80h6E0U+hK6Ra7irX/quTwdSBTOZ/knss2n5oQxLrFP45?= =?iso-8859-1?Q?gYAl3NQ5Q9VJ7BFYcV1DHk2vzkuuUQi9yieZ+qZq1hv+QImEJBVkHuJdy0?= =?iso-8859-1?Q?0WEyWViCK43K9IQwFCKqKnZrokZzI1PDWKpsSQa+Jh59bDJQ5pnlgIDVUI?= =?iso-8859-1?Q?pYng7VhZ/gNcn+UKok2+2KjMG3SrMzz+vfwh99QpNIt8eICqXlqwaZJ7Zz?= =?iso-8859-1?Q?qYAD0qPFcJDLganSDGk3rC0ItaomnPngIX0usW/4RBrb8ELABDgE1V5Vbd?= =?iso-8859-1?Q?S59SlF3AQcnHXzgUqx9szYIVffB/dbOoBRA6F+czj9474K5UmSmcNMsxPa?= =?iso-8859-1?Q?apNExIJsJ6LWzDFn9IaXFuTaAxTxpSe4amlLDsaBByfoVqWjd/e18uSCVS?= =?iso-8859-1?Q?jjbRNZBOH8boL/AkUeqZPqX/POgGJi4AnVnCP4jgKELJdJXs37t3AqsC5E?= =?iso-8859-1?Q?to4qiMFQdTGiSRDt2bHyASHgQTTJ077M75NiwRj5Wo/0COgpZRR/M8bAeb?= =?iso-8859-1?Q?77U9DoK5las3a3a0d6WIlNGOVI8J9CFlCweWmhiVN7QOTy/lShwUAaFMzs?= =?iso-8859-1?Q?Ri27xdDczoz4BTrmKYuGIgH1Wn1H/nO34XU7pWHIt/5wyM7SOa9NYCxvO5?= =?iso-8859-1?Q?ZrXmAQ9uY5WcAyLFiVqUK7/OqX67IW1XsgqavPb6+bKo4R2dbpMdZ8zzcb?= =?iso-8859-1?Q?M72APdR4qLEovUKxVfBnLEC8ymDCsxfw912NFDpK?= Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: xelians.fr X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0147.FRAP264.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 567b70d4-a8ca-49eb-3a97-08dd411414d8 X-MS-Exchange-CrossTenant-originalarrivaltime: 30 Jan 2025 09:54:26.0948 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 6fc1c04c-4bce-498f-bdd2-4a407cbd7dcd X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: kBjd+4/xdFaPsBDC2+y7XHnB+fLB38Lrnrmx7HwKRjP/5Ul8isYs7tptmFZXfTpNgnuPC7gHALAwusfW5S39nvaKmPLbuwjqWXymZw5LUHg= X-MS-Exchange-Transport-CrossTenantHeadersStamped: MRZP264MB1733 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Dear Alvaro and Anton, Thank you for your quick replies, really appreciated. I ran some more tests= today and the results are as follows: Available disk space on primary server : 101GB free in the data directory. = I believe some cleaning up has been done, I need to check with my colleague= s about that. Wal_sender_timeout : #wal_sender_timeout =3D 60s Wal_keep_segments : not present in postgresql.conf Networking issues : none at the moment, the primary server can ping the ser= ver in standby mode by its IP address. Name resolution doesn't work though. Monitor replication lag between the two nodes: - Primary node :=20 SELECT pg_current_wal_lsn(); pg_current_wal_lsn -------------------- 18/B6C404B8 - Standby node : TODO when the node is up again I'm tempted to give the pgbasebackup command a go: https://www.postgresql.o= rg/docs/13/app-pgbasebackup.html pg_basebackup -h primary_host -D /some_app/data -U postgres -v -P @Alvaro Herrera : the archive_command is indeed set in the primary server, = as follows: archive_command =3D 'test ! -f /some_app/data/mnt/server/archivedir/%f && c= p %p /some_app/data/mnt/server/archivedir/%f' There are some files in the pg_wal directory but they only amount to 289MB.= If I need to rebuild the replica from scratch, could I use the pg_baseback= up command above for that purpose ? And you are obviously right about the error message in French. Please find = it below: 2025-01-09 09:46:35.742 CET [3147382] FATAL: n'a pas pu recevoir des donn= =E9es du flux de WAL : ERREUR: le segment demand=E9 du journal de transact= ion, 000000010000000D000000D6, a d=E9j=E0 =E9t=E9 supprim=E9 2025-01-09 09:46:40.745 CET [3147395] LOG: d=E9marr=E9 le flux des journau= x depuis le principal =E0 D/D6000000 sur la timeline 1 2025-01-09 09:46:40.745 CET [3147395] FATAL: n'a pas pu recevoir des donn= =E9es du flux de WAL : ERREUR: le segment demand=E9 du journal de transact= ion, 000000010000000D000000D6, a d=E9j=E0 =E9t=E9 supprim=E9 2025-01-09 09:46:45.749 CET [3147397] LOG: d=E9marr=E9 le flux des journau= x depuis le principal =E0 D/D6000000 sur la timeline 1 2025-01-09 09:46:45.749 CET [3147397] FATAL: n'a pas pu recevoir des donn= =E9es du flux de WAL : ERREUR: le segment demand=E9 du journal de transact= ion, 000000010000000D000000D6, a d=E9j=E0 =E9t=E9 supprim=E9 2025-01-09 09:46:50.753 CET [3147424] LOG: d=E9marr=E9 le flux des journau= x depuis le principal =E0 D/D6000000 sur la timeline 1 2025-01-09 09:46:50.753 CET [3147424] FATAL: n'a pas pu recevoir des donn= =E9es du flux de WAL : ERREUR: le segment demand=E9 du journal de transact= ion, 000000010000000D000000D6, a d=E9j=E0 =E9t=E9 supprim=E9 Thanks again for your help. Best regards, Samuel -----Message d'origine----- De=A0: Alvaro Herrera =20 Envoy=E9=A0: jeudi 9 janvier 2025 12:11 =C0=A0: Samuel VISCAPI Cc=A0: pgsql-admin@lists.postgresql.org Objet=A0: Re: 2nd PostgreSQL server in WAL shipping cluster fails to start On 2025-Jan-09, Samuel VISCAPI wrote: > Dear all, >=20 > Some years ago I set up a two PostgreSQL 13 nodes cluster on Debian 11. T= his cluster uses the Write-Ahead Log Shipping method. I've just been told t= he second server (in standby mode) is down and refuses to start again with = the following error message (roughly translated from French) : >=20 > 2025-01-09 09:46:35.742 CET [3147382] FATAL: could not receive data=20 > from WAL stream: ERROR: segment requested from transaction journal,=20 > 000000010000000D000000D6, has already been removed > 2025-01-09 09:46:40.745 CET [3147395] LOG: journal flow started from=20 > primary at D/D6000000 on timeline 1 > 2025-01-09 09:46:40.745 CET [3147395] FATAL: could not receive data=20 > from WAL stream : ERROR: segment requested from transaction journal,=20 > 000000010000000D000000D6, has already been removed > 2025-01-09 09:46:45.749 CET [3147397] LOG: journal flow started from=20 > primary at D/D6000000 on timeline 1 > 2025-01-09 09:46:45.749 CET [3147397] FATAL: could not receive data=20 > from WAL stream : ERROR: segment requested from transaction journal,=20 > 000000010000000D000000D6, has already been removed > 2025-01-09 09:46:50.753 CET [3147424] LOG: journal flow started from=20 > primary at D/D6000000 on timeline 1 > 2025-01-09 09:46:50.753 CET [3147424] FATAL: could not receive data=20 > from WAL stream : ERROR: segment requested from transaction journal,=20 > 000000010000000D000000D6, has already been removed This means the standby is requesting a segment that was already removed. You may be able to find those files in a WAL archive, if you have archive_c= ommand set in the primary. If you do, then it would work to copy those to = the standby's pg_wal/ subdirectory. If you don't have them, then the repli= ca must be rebuilt. Note that it's not a good idea to translate the error messages when posting= -- that is, it's better to post exactly what the log file has. For helpfulness you could post a translation for the lines separately. But the developers can find the translated messages in the source code, if = they need them ... but if you translate them yourself, there's no way to kn= ow exactly which ones they are. Sometimes subtle differences are important= cues. --=20 =C1lvaro Herrera 48=B001'N 7=B057'E - https://www.Enterpris= eDB.com/