Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tdncb-009YYD-B9 for pgsql-general@arkaria.postgresql.org; Fri, 31 Jan 2025 09:47:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tdnca-00H3eD-5b for pgsql-general@arkaria.postgresql.org; Fri, 31 Jan 2025 09:47:32 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tdncZ-00H3dl-OL for pgsql-general@lists.postgresql.org; Fri, 31 Jan 2025 09:47:31 +0000 Received: from mail-ej1-x632.google.com ([2a00:1450:4864:20::632]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tdncX-002RP8-23 for pgsql-general@lists.postgresql.org; Fri, 31 Jan 2025 09:47:30 +0000 Received: by mail-ej1-x632.google.com with SMTP id a640c23a62f3a-ab6fb2940d4so118042666b.1 for ; Fri, 31 Jan 2025 01:47:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738316848; x=1738921648; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=yhwWesBsMSapIqdDHgBQUyHNil55PSWwSNXMQvkbOQc=; b=gN5sSLrp+xAHRDQVGC6I6DGphdV7UNTGSntyjWzpuzdD3hNFGrPWo0pDW/6ufu3UvC CjRE/WG5HYZar3Yclht2dQ7Y7fbIESRCP/0yCvm2BxHzLYtisGSBl+uLFurR+GrW1PIS ntDOiwV6ui6yGCc0KF+Y7k+zRccFptZ1hXRWjtzv6LQh4UKg1wDUudkDfKv1KlzTob1b mdsbd+X4LcmBDia1SLAJT6NiH9RE6F0EO7M5n5wv8Mhcx9WeQQIIE3ykjkdwp53o2MX/ v1LiFeq0vqV4TBk756S4cqgbMWGmcC05Mi/qucIYOdnYGoIonRybsanITGE58KPmhfh0 ZjTQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738316848; x=1738921648; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=yhwWesBsMSapIqdDHgBQUyHNil55PSWwSNXMQvkbOQc=; b=g/FE767e8/3VD13L+gI7sElLCmy1oMUnbf2hrkOsUo+gNBC4hR5DwD/amrHAntzXOO FX2SCEOTHPtKfCbs2vBwhbeYHKvM3Lex0HLeITbQrU7JOE8IDzUJVKkQoHfJqPsXesmP kkHgTuwdawqRdVi4iO2WgcAFYbbs3tlvpehEPfadwM0+m3veAwY4Qiztl/BlSQQNipUn mXfkE2DaLIM0yZAaOgj+4dWnctQtqjBzPJEc0li0bYxYb+0/lzrGeTpSd0gIM2sAdFnb xlMPZ4n149taB2SN6G3f0E4mS4I3KzoCzVNRnJ5FwO7vgRRhi5r1uMyYdkY1KfkS/Bc7 s3Ng== X-Gm-Message-State: AOJu0YwvMcvN5iSG0tT2y5XXwtntAhB2GPxQmdMHiwi++TGBv3HUk/s8 oYGkNxxhmS2Qqi47140jFvRscEPMv071DcbZ9Sr18M7XFjBpz/yu9xAy0INqw/32zOkFfO2rpDG 366rx6Ejufeb6wqGFFRV6uKYcvxT4Usc1 X-Gm-Gg: ASbGncsM4JKdlyvz2RI+5VJg0XELouKyJYajwarKOXhMtyyk/uql2lyLS9kTYO1CekH OreOHzM3bi9hNkUtxqFWBo+tnIbTskJQVTreteM4leD7oWun61HJhgqaRJdgfAj0pM6rLPkNm X-Google-Smtp-Source: AGHT+IGvDH3dor9tDS6wuKo0WNvvgheVVr/6os9uqL6KI9d9vcZmet/IDLkWIJuJCAtz7hUzTi4mGTzyKjxHG5P3k48= X-Received: by 2002:a17:906:7951:b0:ab6:f0d3:9687 with SMTP id a640c23a62f3a-ab6f0d3e9cdmr415786166b.21.1738316848184; Fri, 31 Jan 2025 01:47:28 -0800 (PST) MIME-Version: 1.0 From: Koen De Groote Date: Fri, 31 Jan 2025 10:47:17 +0100 X-Gm-Features: AWEUYZm4gKdM41fDy9MmnSwz7n9cO6uhLpEVjylFOBsVRIoY_B81gJcs-HHS7YM Message-ID: Subject: Postgres restore sometimes restores to a point 2 days in the past To: PostgreSQL General Content-Type: multipart/alternative; boundary="0000000000003e4577062cfd6e80" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000003e4577062cfd6e80 Content-Type: text/plain; charset="UTF-8" I'm running postgres 16.6 My backup strategy is: basebackup and WAL archive. These get uploaded to the cloud. The restore is on an isolated machine and is performed daily. It downloads the basebackup, unpacks it, sets a recovery.signal, and a script is provided as restore_command, to download the WAL archives %f and unpack them into %p In the script, the final unpacking is simply "gzip -dc %f > %p". The gz files are first checked with "gzip -t". If a WAL archive is asked that doesn't exist yet, the script naturally cannot find it, and exits with status code 1. This is the end of the recovery. There are a few tables that are known to receive new entries multiple times per day. However, the state of the recovery showed the latest item to be 2 days in the past. Checking the live DB, there are an expected amount of items since that ID. I checked the logs, the last WAL archive that got downloaded is indeed the last one that was available. The one that failed to download on the restore machine, was uploaded to the cloud 8 minutes later, according to the upload logs on the live DB. The postgres logs themselves seem perfectly normal. It logs all these WAL recoveries, switches the timeline, and becomes available. What could be going wrong? My main issue is that I don't know where to start looking, since nothing in the logs seems abnormal. Regards, Koen De Groote --0000000000003e4577062cfd6e80 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
I'm running postgres 16.6

My backup= strategy is: basebackup and WAL archive. These get uploaded to the cloud.<= /div>

The restore is on an isolated machine and is perfo= rmed daily. It downloads the basebackup, unpacks it, sets a recovery.signal= , and a script is provided as restore_command, to download the WAL archives= =C2=A0%f and unpack them into %p

In the script, th= e final unpacking is simply "gzip -dc %f > %p". The gz files a= re first checked with "gzip -t".

If a WAL archiv= e is asked that doesn't exist yet, the script naturally cannot find it,= and exits with status code 1. This is the end of the recovery.

There are a few tables that are known to receive new entries multip= le times per day. However, the state of the recovery showed the latest item= to be 2 days in the past. Checking the live DB, there are an expected amou= nt of items since that ID.

I checked the logs, the= last WAL archive that got downloaded is indeed the last one that was avail= able. The one that failed to download on the restore machine, was uploaded = to the cloud 8 minutes later, according to the upload logs on the live DB.<= /div>

The postgres logs themselves seem perfectly normal= . It logs all these WAL recoveries, switches the timeline, and becomes avai= lable.

What could be going wrong? My main issue is= that I don't know where to start looking, since nothing in the logs se= ems abnormal.

Regards,
Koen De Groote
--0000000000003e4577062cfd6e80--