Date: Mon, 15 Apr 2024 06:52:48 +0000
To: "pgsql-general@lists.postgresql.org"
From: Nicolas Seinlet
Subject: Failing streaming replication on PostgreSQL 14

Hello everyone,

Since I moved some clusters from PostgreSQL 12 to 14, I have noticed random failures in streaming replication. I say "random" mostly because I haven't found the source of the issue.

I'm using the Ubuntu / encrypted ZFS / PostgreSQL combination, with Ubuntu LTS (20.04 and 22.04) and the ZFS and PostgreSQL versions provided with each LTS release (PostgreSQL 12 on Ubuntu 20.04 and 14 on 22.04).

Streaming replication is configured with `primary_conninfo 'host=main_server port=5432 user=replicant password=a_very_secure_password sslmode=require application_name=replication_postgresql_app'`, with no replication slot and no restore command. The WAL is configured with `full_page_writes = off`, `wal_init_zero = off` and `wal_recycle = off`.

While this works like a charm on PostgreSQL 12, it sometimes fails on PostgreSQL 14. As we also changed the OS, the issue may lie somewhere else.

When the issue is detected, the WAL on the primary is correct, but a piece of the WAL on the secondary is wrong. Only some bytes differ; some bytes later, the WAL is correct again. Stopping PostgreSQL on the secondary, removing the wrong WAL file, and restarting PostgreSQL solves the issue.

We've added another secondary and noticed that the issue can appear on one of the secondaries, but not on both at the same time.

What can I do to detect the origin of this issue?

Have a nice week,

Nicolas.
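
To make "only some bytes" concrete, here is a minimal sketch of how the two copies of one WAL segment could be compared to list the byte ranges that differ. It is an illustration only: the function name `diff_ranges` and the command-line usage are hypothetical, and the 8192-byte block size is an assumption (the default WAL page size). It assumes both copies of the segment have been copied to the same machine.

```python
import sys
from typing import List, Tuple

BLOCK = 8192  # assumption: default WAL page size; adjust if the build differs


def diff_ranges(a: bytes, b: bytes, block: int = BLOCK) -> List[Tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ.

    Equal blocks are skipped with a single comparison; only blocks that
    differ are scanned byte by byte. If the lengths differ, the extra tail
    is reported as a separate final range.
    """
    ranges: List[Tuple[int, int]] = []
    n = min(len(a), len(b))
    start = None  # start offset of the differing run currently open, if any
    for off in range(0, n, block):
        chunk_a, chunk_b = a[off:off + block], b[off:off + block]
        if chunk_a == chunk_b:
            if start is not None:  # a run ended exactly at the block boundary
                ranges.append((start, off))
                start = None
            continue
        for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
            if x != y:
                if start is None:
                    start = off + i
            elif start is not None:
                ranges.append((start, off + i))
                start = None
    if start is not None:
        ranges.append((start, n))
    if len(a) != len(b):
        ranges.append((n, max(len(a), len(b))))
    return ranges


def main(primary_path: str, secondary_path: str) -> None:
    # Hypothetical usage: both arguments are copies of the same segment file.
    with open(primary_path, "rb") as f:
        primary = f.read()
    with open(secondary_path, "rb") as f:
        secondary = f.read()
    for s, e in diff_ranges(primary, secondary):
        print(f"offsets {s}..{e - 1} differ ({e - s} bytes, block {s // BLOCK})")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Recording the offsets and lengths over several occurrences might show whether the damaged ranges line up with a fixed block boundary or size, which could help tell a storage-level problem from a network-level one. A segment still being written on the secondary may legitimately differ past the last received position, so this is probably only meaningful up to that point.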