From: Andrey Borodin
Subject: Re: Streaming replication and WAL archive interactions
Date: Thu, 12 Feb 2026 11:56:10 +0500
In-Reply-To: <5550D20D.6090703@iki.fi>
To: hlinnaka@iki.fi
Cc: Michael Paquier, Robert Haas, Venkata Balaji N, Andres Freund, Fujii Masao, Borodin Vladimir, PostgreSQL-development, nkak@vmware.com, Roman Khapov, Kirill Reshke, ShirishaRao@vmware.com
References: <548AF1CB.80702@vmware.com> <689EB259-44C2-4820-B901-4F6B1C55A1E4@simply.name> <549083D6.1000301@vmware.com> <54949108.3030109@vmware.com> <552FA38F.9060005@iki.fi> <5535FE71.1010905@iki.fi> <55362CAD.2000207@iki.fi> <553741FE.1080403@iki.fi> <554CB84E.3070406@iki.fi> <5550D20D.6090703@iki.fi>
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3864.300.41.1.7\))
Content-Type: multipart/mixed; boundary="Apple-Mail=_0C363831-9101-4A4D-87C3-689941891F87"

--Apple-Mail=_0C363831-9101-4A4D-87C3-689941891F87
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

> On 11 May 2015, at 21:00, Heikki Linnakangas wrote:
>
> Applied that part.
>
>> Now that we got this last-partial-segment problem out of the way, I'm
>> going to try fixing the problem you (Michael) pointed out about relying
>> on pgstat file. Meanwhile, I'd love to get more feedback on the rest of
>> the patch, and the documentation.
>
> And here is a new version of the patch. I kept the approach of using pgstat, but it now only polls pgstat every 10 seconds, and doesn't block to wait for updated stats.

Hi Heikki,

There’s a nearby thread [0] (about 10 years later) where I’m working on a problem your patch from this thread helps solve.

In large datacenter outages, 1–2% of clusters end up with gaps in their PITR timeline.
In HA setups, when the primary is lost, some WAL can be missing from the archive even though it was streamed to the standby. Many HA tools (PGConsul, Patroni, etc.) try to re-archive from the standby, but those WAL files may already have been removed.

Your “shared” archive mode addresses this: the standby keeps WAL until it’s archived. archive_mode=always plus an archive tool can work, but it’s expensive. In WAL-G, for example, the archive command does a GET on the standby’s WAL, then decrypts and compares. Switching to HEAD would reduce cost in some clouds but still adds cost.

Another option is coordinating archiving outside Postgres, but that would mean building distributed coordination into the archive tool. Shared archive mode tackles this in Postgres itself.

I’ve retrofitted your patch, incorporated ideas from the Greenplum work [1], and made some improvements. The patchset has three parts:
* Rebase + tests – Your original patch, rebased, with tests added.
* Timeline switching – Correct handling of timeline switches in archive status updates.
* Avoid directory scans – Skip scanning archive_status when possible, which was costly in WAL-G setups.

What do you think?

Best regards, Andrey Borodin.

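
For illustration only, the standby-side decision in shared mode (a segment waiting as .ready may be marked .done once the primary reports a segment on the same timeline that is not older) can be sketched as standalone C. This is not code from the patchset: the function names and the SKETCH_SEGMENTS_PER_XLOGID constant are hypothetical, and the sketch assumes the default 16 MB WAL segment size, whereas the server derives the segment number from the configured wal_segment_size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 16 MB segments, so 4 GB / 16 MB = 256 segments per "xlogid". */
#define SKETCH_SEGMENTS_PER_XLOGID 256u

/*
 * Hypothetical helper: split a 24-hex-digit WAL segment file name into its
 * timeline ID and a linear segment number. Returns false for anything that
 * is not a plain segment name (e.g. timeline history files).
 */
bool
sketch_parse_wal_name(const char *name, uint32_t *tli, uint64_t *segno)
{
	unsigned int t, log, seg;

	assert(name != NULL);
	if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
		return false;
	if (sscanf(name, "%08X%08X%08X", &t, &log, &seg) != 3)
		return false;
	*tli = t;
	*segno = (uint64_t) log * SKETCH_SEGMENTS_PER_XLOGID + seg;
	return true;
}

/*
 * Hypothetical helper: true if a segment waiting as .ready on the standby is
 * covered by the primary's report, i.e. same timeline and not newer than the
 * last segment the primary says it archived.
 */
bool
sketch_covered_by_report(const char *ready_seg, const char *reported)
{
	uint32_t	seg_tli, rep_tli;
	uint64_t	seg_no, rep_no;

	if (!sketch_parse_wal_name(ready_seg, &seg_tli, &seg_no) ||
		!sketch_parse_wal_name(reported, &rep_tli, &rep_no))
		return false;
	return seg_tli == rep_tli && seg_no <= rep_no;
}
```

Segments on a different timeline than the reported one are deliberately left alone in this sketch, so they stay .ready until a later report or promotion deals with them.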
--Apple-Mail=_0C363831-9101-4A4D-87C3-689941891F87 Content-Disposition: attachment; filename=v4-0001-Add-archive_mode-shared-for-coordinated-WAL-archi.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="v4-0001-Add-archive_mode-shared-for-coordinated-WAL-archi.patch" Content-Transfer-Encoding: quoted-printable =46rom=20334637b17d6ea1f93bb10966a064eb70ca472db9=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20Andrey=20Borodin=20=0ADate:=20= Tue,=2010=20Feb=202026=2012:47:32=20+0500=0ASubject:=20[PATCH=20v4=20= 1/3]=20Add=20archive_mode=3Dshared=20for=20coordinated=20WAL=20archiving=0A= =0AIntroduce=20a=20new=20archive_mode=20setting=20"shared"=20to=20= prevent=20WAL=20history=0Aloss=20during=20standby=20promotion=20in=20HA=20= streaming=20replication=20setups.=0A=0AIn=20shared=20mode,=20the=20= primary=20proactively=20sends=20archival=20status=20updates=0Ato=20= standbys=20via=20the=20replication=20protocol.=20The=20standby=20creates=20= .ready=0Afiles=20for=20received=20WAL=20segments=20but=20defers=20= marking=20them=20as=20.done=20until=0Athe=20primary=20confirms=20= archival.=20This=20prevents=20WAL=20from=20being=20recycled=0Abefore=20= it's=20safely=20archived,=20addressing=20a=20critical=20gap=20in=20PITR=20= continuity=0Aduring=20failover.=0A=0AKey=20implementation=20details:=0A=0A= -=20Primary=20periodically=20sends=20last=20archived=20WAL=20segment=20= via=20new=0A=20=20PqReplMsg_ArchiveStatusReport=20('a')=20message=0A-=20= Standby=20marks=20all=20segments=20<=3D=20reported=20segment=20as=20= .done=20using=0A=20=20alphanumeric=20comparison=20on=20segment=20part=20= (timeline-safe)=0A-=20Archiver=20skips=20during=20recovery=20in=20shared=20= mode,=20activates=20on=20promotion=0A-=20Cascading=20replication:=20each=20= standby=20coordinates=20with=20immediate=20upstream=0A-=20Startup=20= check=20rejects=20archive_mode=3Don=20during=20recovery=0A=0AThis=20= "push"=20design=20(primary=20sends=20status)=20is=20more=20efficient=20= 
than=20"pull"=0A(standby=20queries=20per-segment),=20avoiding=20= directory=20scans=20and=20stat()=20calls.=0ABased=20on=20Heikki=20= Linnakangas's=202014=20design=20and=20Greenplum's=20production=0A= implementation,=20modernized=20for=20PostgreSQL=2019.=0A=0AIncludes=20= TAP=20tests=20covering=20basic=20synchronization,=20promotion,=0A= cascading=20replication,=20and=20multiple=20standbys=20scenarios.=0A---=0A= =20doc/src/sgml/config.sgml=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20|=20=2036=20++-=0A=20doc/src/sgml/high-availability.sgml=20=20=20=20= =20=20=20|=20=2072=20++++--=0A=20src/backend/access/transam/xlog.c=20=20=20= =20=20=20=20=20=20|=20=20=201=20+=0A=20src/backend/postmaster/pgarch.c=20= =20=20=20=20=20=20=20=20=20=20|=20=2017=20+-=0A=20= src/backend/replication/walreceiver.c=20=20=20=20=20|=20146=20= +++++++++++-=0A=20src/backend/replication/walsender.c=20=20=20=20=20=20=20= |=20=2093=20++++++++=0A=20src/include/access/xlog.h=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20|=20=20=201=20+=0A=20= src/include/libpq/protocol.h=20=20=20=20=20=20=20=20=20=20=20=20=20=20|=20= =20=201=20+=0A=20src/test/recovery/t/050_archive_shared.pl=20|=20270=20= ++++++++++++++++++++++=0A=209=20files=20changed,=20599=20insertions(+),=20= 38=20deletions(-)=0A=20create=20mode=20100644=20= src/test/recovery/t/050_archive_shared.pl=0A=0Adiff=20--git=20= a/doc/src/sgml/config.sgml=20b/doc/src/sgml/config.sgml=0Aindex=20= 37342986969..5a6af71d2f8=20100644=0A---=20a/doc/src/sgml/config.sgml=0A= +++=20b/doc/src/sgml/config.sgml=0A@@=20-3845,14=20+3845,36=20@@=20= include_dir=20'conf.d'=0A=20=20=20=20=20=20=20=20=20are=20sent=20to=20= archive=20storage=20by=20setting=0A=20=20=20=20=20=20=20=20=20=20or=0A=20=20=20=20=20=20=20=20=20= .=20In=20addition=20to=20= off,=0A-=20=20=20=20=20=20=20=20to=20disable,=20there=20= are=20two=20modes:=20on,=20and=0A-=20=20=20=20=20=20=20= =20always.=20During=20normal=20operation,=20there=20= 
is=20no=0A-=20=20=20=20=20=20=20=20difference=20between=20the=20two=20= modes,=20but=20when=20set=20to=20always=0A-=20=20=20=20= =20=20=20=20the=20WAL=20archiver=20is=20enabled=20also=20during=20= archive=20recovery=20or=20standby=0A-=20=20=20=20=20=20=20=20mode.=20In=20= always=20mode,=20all=20files=20restored=20from=20the=20= archive=0A-=20=20=20=20=20=20=20=20or=20streamed=20with=20streaming=20= physical=20replication=20will=20be=20archived=20(again).=20See=0A-=20=20=20= =20=20=20=20=20=20= for=20details.=0A+=20=20=20=20=20=20=20=20to=20disable,=20there=20are=20= three=20modes:=20on,=20shared,=0A+=20= =20=20=20=20=20=20=20and=20always.=20During=20normal=20= operation=20as=20a=20primary,=20there=20is=20no=0A+=20=20=20=20=20=20=20=20= difference=20between=20the=20three=20modes,=20but=20they=20differ=20= during=20archive=20recovery=20or=0A+=20=20=20=20=20=20=20=20standby=20= mode:=0A=20=20=20=20=20=20=20=20=0A+=20=20=20=20=20=20=20= =0A+=20=20=20=20=20=20=20=20=0A+=20=20=20=20=20=20= =20=20=20=0A+=20=20=20=20=20=20=20=20=20=20on:=20= Archives=20WAL=20only=20when=20running=20as=20a=20primary.=0A+=20=20=20=20= =20=20=20=20=20=0A+=20=20=20=20=20=20=20=20=0A+=20=20=20= =20=20=20=20=20=0A+=20=20=20=20=20=20=20=20=20=0A+=20=20=20= =20=20=20=20=20=20=20shared:=20Coordinates=20= archiving=20between=20primary=20and=20standby.=0A+=20=20=20=20=20=20=20=20= =20=20The=20standby=20defers=20WAL=20archival=20and=20deletion=20until=20= the=20primary=20confirms=0A+=20=20=20=20=20=20=20=20=20=20archival=20via=20= streaming=20replication.=20This=20prevents=20WAL=20history=20loss=20= during=0A+=20=20=20=20=20=20=20=20=20=20standby=20promotion=20in=20high=20= availability=20setups.=20Upon=20promotion,=20the=20standby=0A+=20=20=20=20= =20=20=20=20=20=20automatically=20starts=20archiving=20any=20remaining=20= unarchived=20WAL.=20This=20mode=20works=0A+=20=20=20=20=20=20=20=20=20=20= with=20cascading=20replication,=20where=20each=20standby=20coordinates=20= 
with=20its=20immediate=0A+=20=20=20=20=20=20=20=20=20=20upstream=20= server.=20See=20=20= for=20details.=0A+=20=20=20=20=20=20=20=20=20=0A+=20=20=20=20=20=20= =20=20=0A+=20=20=20=20=20=20=20=20=0A+=20=20=20=20=20= =20=20=20=20=0A+=20=20=20=20=20=20=20=20=20=20= always:=20Archives=20all=20WAL=20independently,=20= even=20during=20recovery.=0A+=20=20=20=20=20=20=20=20=20=20All=20files=20= restored=20from=20the=20archive=20or=20streamed=20with=20streaming=20= physical=0A+=20=20=20=20=20=20=20=20=20=20replication=20will=20be=20= archived=20(again),=20regardless=20of=20their=20source.=0A+=20=20=20=20=20= =20=20=20=20=0A+=20=20=20=20=20=20=20=20=0A+=20=20=20=20= =20=20=20=0A=20=20=20=20=20=20=20=20=0A=20=20=20=20=20= =20=20=20=20archive_mode=20is=20a=20separate=20= setting=20from=0A=20=20=20=20=20=20=20=20=20= archive_command=20and=0Adiff=20--git=20= a/doc/src/sgml/high-availability.sgml=20= b/doc/src/sgml/high-availability.sgml=0Aindex=20c3f269e0364..8f1a4d6784c=20= 100644=0A---=20a/doc/src/sgml/high-availability.sgml=0A+++=20= b/doc/src/sgml/high-availability.sgml=0A@@=20-1447,35=20+1447,61=20@@=20= postgres=3D#=20WAIT=20FOR=20LSN=20'0/306EE20';=0A=20=20=20=20= =0A=20=0A=20=20=20=20=0A-=20=20=20=20=20When=20= continuous=20WAL=20archiving=20is=20used=20in=20a=20standby,=20there=20= are=20two=0A-=20=20=20=20=20different=20scenarios:=20the=20WAL=20archive=20= can=20be=20shared=20between=20the=20primary=0A-=20=20=20=20=20and=20the=20= standby,=20or=20the=20standby=20can=20have=20its=20own=20WAL=20archive.=20= When=0A-=20=20=20=20=20the=20standby=20has=20its=20own=20WAL=20archive,=20= set=20archive_mode=0A+=20=20=20=20=20When=20= continuous=20WAL=20archiving=20is=20used=20in=20a=20standby,=20there=20= are=20three=0A+=20=20=20=20=20different=20scenarios:=20the=20standby=20= can=20have=20its=20own=20independent=20WAL=20archive,=0A+=20=20=20=20=20= the=20WAL=20archive=20can=20be=20shared=20between=20the=20primary=20and=20= 
standby,=20or=20archiving=0A+=20=20=20=20=20can=20be=20coordinated=20= between=20them.=0A+=20=20=20=0A+=0A+=20=20=20=0A+=20=20=20=20= =20For=20an=20independent=20archive,=20set=20= archive_mode=0A=20=20=20=20=20=20to=20= always,=20and=20the=20standby=20will=20call=20the=20= archive=0A=20=20=20=20=20=20command=20for=20every=20WAL=20segment=20it=20= receives,=20whether=20it's=20by=20restoring=0A-=20=20=20=20=20from=20the=20= archive=20or=20by=20streaming=20replication.=20The=20shared=20archive=20= can=0A-=20=20=20=20=20be=20handled=20similarly,=20but=20the=20= archive_command=20or=20= archive_library=20must=0A-=20=20=20=20=20test=20if=20= the=20file=20being=20archived=20exists=20already,=20and=20if=20the=20= existing=20file=0A-=20=20=20=20=20has=20identical=20contents.=20This=20= requires=20more=20care=20in=20the=0A-=20=20=20=20=20= archive_command=20or=20= archive_library,=20as=20it=20must=0A-=20=20=20=20=20= be=20careful=20to=20not=20overwrite=20an=20existing=20file=20with=20= different=20contents,=0A-=20=20=20=20=20but=20return=20success=20if=20= the=20exactly=20same=20file=20is=20archived=20twice.=20And=0A-=20=20=20=20= =20all=20that=20must=20be=20done=20free=20of=20race=20conditions,=20if=20= two=20servers=20attempt=0A-=20=20=20=20=20to=20archive=20the=20same=20= file=20at=20the=20same=20time.=0A+=20=20=20=20=20from=20the=20archive=20= or=20by=20streaming=20replication.=0A+=20=20=20=0A+=0A+=20=20=20= =0A+=20=20=20=20=20For=20a=20shared=20archive=20where=20both=20= primary=20and=20standby=20can=20write,=20use=0A+=20=20=20=20=20= always=20mode=20as=20well,=20but=20the=20= archive_command=0A+=20=20=20=20=20or=20= archive_library=20must=20test=20if=20the=20file=20= being=20archived=0A+=20=20=20=20=20exists=20already,=20and=20if=20the=20= existing=20file=20has=20identical=20contents.=20This=20requires=0A+=20=20= =20=20=20more=20care=20in=20the=20archive_command=20= or=20archive_library,=0A+=20=20=20=20=20as=20it=20= 
must=20be=20careful=20to=20not=20overwrite=20an=20existing=20file=20with=20= different=20contents,=0A+=20=20=20=20=20but=20return=20success=20if=20= the=20exactly=20same=20file=20is=20archived=20twice.=20And=20all=20that=20= must=0A+=20=20=20=20=20be=20done=20free=20of=20race=20conditions,=20if=20= two=20servers=20attempt=20to=20archive=20the=20same=20file=0A+=20=20=20=20= =20at=20the=20same=20time.=0A+=20=20=20=0A+=0A+=20=20=20=0A= +=20=20=20=20=20For=20coordinated=20archiving=20in=20high=20availability=20= setups,=20use=0A+=20=20=20=20=20= archive_mode=3Dshared.=20In=20this=20= mode,=20only=0A+=20=20=20=20=20the=20primary=20archives=20WAL=20= segments.=20The=20standby=20creates=20.ready=0A+=20=20= =20=20=20files=20for=20received=20segments=20but=20defers=20actual=20= archiving.=20The=20primary=20periodically=0A+=20=20=20=20=20sends=20= archival=20status=20updates=20to=20the=20standby=20via=20streaming=20= replication,=20informing=0A+=20=20=20=20=20it=20which=20segments=20have=20= been=20archived.=20The=20standby=20then=20marks=20these=20as=20archived=0A= +=20=20=20=20=20and=20allows=20them=20to=20be=20recycled.=20Upon=20= promotion,=20the=20standby=20automatically=20starts=0A+=20=20=20=20=20= archiving=20any=20remaining=20WAL=20segments=20that=20weren't=20= confirmed=20as=20archived=20by=20the=0A+=20=20=20=20=20former=20primary.=20= This=20prevents=20WAL=20history=20loss=20during=20failover=20while=20= avoiding=0A+=20=20=20=20=20the=20complexity=20of=20coordinating=20= concurrent=20archiving.=20This=20mode=20works=20with=20cascading=0A+=20=20= =20=20=20replication,=20where=20each=20standby=20coordinates=20with=20= its=20immediate=20upstream=20server.=0A=20=20=20=20=0A=20=0A=20=20= =20=20=0A=20=20=20=20=20=20If=20archive_mode=20= is=20set=20to=20on,=20the=0A-=20=20=20=20=20archiver=20= is=20not=20enabled=20during=20recovery=20or=20standby=20mode.=20If=20the=20= standby=0A-=20=20=20=20=20server=20is=20promoted,=20it=20will=20start=20= 
archiving=20after=20the=20promotion,=20but=0A-=20=20=20=20=20will=20not=20= archive=20any=20WAL=20or=20timeline=20history=20files=20that=0A-=20=20=20= =20=20it=20did=20not=20generate=20itself.=20To=20get=20a=20complete=0A-=20= =20=20=20=20series=20of=20WAL=20files=20in=20the=20archive,=20you=20must=20= ensure=20that=20all=20WAL=20is=0A-=20=20=20=20=20archived,=20before=20it=20= reaches=20the=20standby.=20This=20is=20inherently=20true=20with=0A-=20=20= =20=20=20file-based=20log=20shipping,=20as=20the=20standby=20can=20only=20= restore=20files=20that=0A-=20=20=20=20=20are=20found=20in=20the=20= archive,=20but=20not=20if=20streaming=20replication=20is=20enabled.=0A-=20= =20=20=20=20When=20a=20server=20is=20not=20in=20recovery=20mode,=20there=20= is=20no=20difference=20between=0A-=20=20=20=20=20on=20= and=20always=20modes.=0A+=20=20=20=20=20archiver=20is=20= not=20enabled=20during=20recovery=20or=20standby=20mode,=20and=20this=20= setting=0A+=20=20=20=20=20cannot=20be=20used=20on=20a=20standby.=20If=20= a=20standby=20with=20archive_mode=0A+=20=20=20=20=20= set=20to=20on=20is=20promoted,=20it=20will=20start=20= archiving=20after=20the=0A+=20=20=20=20=20promotion,=20but=20will=20not=20= archive=20any=20WAL=20or=20timeline=20history=20files=20that=20it=20did=0A= +=20=20=20=20=20not=20generate=20itself.=20To=20get=20a=20complete=20= series=20of=20WAL=20files=20in=20the=20archive,=20you=0A+=20=20=20=20=20= must=20ensure=20that=20all=20WAL=20is=20archived=20before=20it=20reaches=20= the=20standby.=20This=20is=0A+=20=20=20=20=20inherently=20true=20with=20= file-based=20log=20shipping,=20as=20the=20standby=20can=20only=20restore=0A= +=20=20=20=20=20files=20that=20are=20found=20in=20the=20archive,=20but=20= not=20if=20streaming=20replication=20is=20enabled.=0A+=20=20=20=0A= +=0A+=20=20=20=0A+=20=20=20=20=20When=20a=20server=20is=20not=20in=20= recovery=20mode,=20on,=0A+=20=20=20=20=20= shared,=20and=20always=20modes=20= 
all=20behave=0A+=20=20=20=20=20identically,=20archiving=20completed=20= WAL=20segments.=0A=20=20=20=20=0A=20=20=20=0A=20=20=20= =0Adiff=20--git=20a/src/backend/access/transam/xlog.c=20= b/src/backend/access/transam/xlog.c=0Aindex=2013ec6225b85..a751950b7cd=20= 100644=0A---=20a/src/backend/access/transam/xlog.c=0A+++=20= b/src/backend/access/transam/xlog.c=0A@@=20-195,6=20+195,7=20@@=20const=20= struct=20config_enum_entry=20archive_mode_options[]=20=3D=20{=0A=20=09= {"always",=20ARCHIVE_MODE_ALWAYS,=20false},=0A=20=09{"on",=20= ARCHIVE_MODE_ON,=20false},=0A=20=09{"off",=20ARCHIVE_MODE_OFF,=20false},=0A= +=09{"shared",=20ARCHIVE_MODE_SHARED,=20false},=0A=20=09{"true",=20= ARCHIVE_MODE_ON,=20true},=0A=20=09{"false",=20ARCHIVE_MODE_OFF,=20true},=0A= =20=09{"yes",=20ARCHIVE_MODE_ON,=20true},=0Adiff=20--git=20= a/src/backend/postmaster/pgarch.c=20b/src/backend/postmaster/pgarch.c=0A= index=2082731e452fc..0433126150c=20100644=0A---=20= a/src/backend/postmaster/pgarch.c=0A+++=20= b/src/backend/postmaster/pgarch.c=0A@@=20-385,6=20+385,15=20@@=20= pgarch_ArchiverCopyLoop(void)=0A=20{=0A=20=09char=09=09= xlog[MAX_XFN_CHARS=20+=201];=0A=20=0A+=09/*=0A+=09=20*=20In=20shared=20= archive=20mode=20during=20recovery,=20the=20archiver=20doesn't=20archive=0A= +=09=20*=20files.=20The=20primary=20is=20responsible=20for=20archiving,=20= and=20the=20walreceiver=0A+=09=20*=20marks=20files=20as=20.done=20when=20= the=20primary=20confirms=20archival.=20After=0A+=09=20*=20promotion,=20= the=20archiver=20starts=20working=20normally.=0A+=09=20*/=0A+=09if=20= (XLogArchiveMode=20=3D=3D=20ARCHIVE_MODE_SHARED=20&&=20= RecoveryInProgress())=0A+=09=09return;=0A+=0A=20=09/*=20force=20= directory=20scan=20in=20the=20first=20call=20to=20pgarch_readyXlog()=20= */=0A=20=09arch_files->arch_files_size=20=3D=200;=0A=20=0A@@=20-475,10=20= +484,10=20@@=20pgarch_ArchiverCopyLoop(void)=0A=20=09=09=09=09continue;=0A= =20=09=09=09}=0A=20=0A-=09=09=09if=20(pgarch_archiveXlog(xlog))=0A-=09=09= 
=09{=0A-=09=09=09=09/*=20successful=20*/=0A-=09=09=09=09= pgarch_archiveDone(xlog);=0A+=09=09if=20(pgarch_archiveXlog(xlog))=0A+=09= =09{=0A+=09=09=09/*=20successful=20*/=0A+=09=09=09= pgarch_archiveDone(xlog);=0A=20=0A=20=09=09=09=09/*=0A=20=09=09=09=09=20= *=20Tell=20the=20cumulative=20stats=20system=20about=20the=20WAL=20file=20= that=20we=0Adiff=20--git=20a/src/backend/replication/walreceiver.c=20= b/src/backend/replication/walreceiver.c=0Aindex=20= 10e64a7d1f4..ed0edd258bb=20100644=0A---=20= a/src/backend/replication/walreceiver.c=0A+++=20= b/src/backend/replication/walreceiver.c=0A@@=20-132,6=20+132,11=20@@=20= static=20TimestampTz=20wakeup[NUM_WALRCV_WAKEUPS];=0A=20=0A=20static=20= StringInfoData=20reply_message;=0A=20=0A+/*=20Last=20archived=20WAL=20= segment=20file=20reported=20by=20the=20primary=20*/=0A+static=20char=20= primary_last_archived[MAX_XFN_CHARS=20+=201];=0A+static=20TimeLineID=20= primary_last_archived_tli=20=3D=200;=0A+static=20XLogSegNo=20= primary_last_archived_segno=20=3D=200;=0A+=0A=20/*=20Prototypes=20for=20= private=20functions=20*/=0A=20static=20void=20= WalRcvFetchTimeLineHistoryFiles(TimeLineID=20first,=20TimeLineID=20= last);=0A=20static=20void=20WalRcvWaitForStartPosition(XLogRecPtr=20= *startpoint,=20TimeLineID=20*startpointTLI);=0A@@=20-145,6=20+150,7=20@@=20= static=20void=20XLogWalRcvClose(XLogRecPtr=20recptr,=20TimeLineID=20= tli);=0A=20static=20void=20XLogWalRcvSendReply(bool=20force,=20bool=20= requestReply);=0A=20static=20void=20XLogWalRcvSendHSFeedback(bool=20= immed);=0A=20static=20void=20ProcessWalSndrMessage(XLogRecPtr=20walEnd,=20= TimestampTz=20sendTime);=0A+static=20void=20ProcessArchivalReport(void);=0A= =20static=20void=20WalRcvComputeNextWakeup(WalRcvWakeupReason=20reason,=20= TimestampTz=20now);=0A=20=0A=20=0A@@=20-888,6=20+894,30=20@@=20= XLogWalRcvProcessMsg(unsigned=20char=20type,=20char=20*buf,=20Size=20= len,=20TimeLineID=20tli)=0A=20=09=09=09=09=09XLogWalRcvSendReply(true,=20= 
false);=0A=20=09=09=09=09break;=0A=20=09=09=09}=0A+=09=09case=20= PqReplMsg_ArchiveStatusReport:=0A+=09=09=09{=0A+=09=09=09=09/*=20Check=20= that=20the=20filename=20looks=20valid=20*/=0A+=09=09=09=09if=20(len=20>=3D= =20sizeof(primary_last_archived))=0A+=09=09=09=09=09ereport(ERROR,=0A+=09= =09=09=09=09=09=09(errcode(ERRCODE_PROTOCOL_VIOLATION),=0A+=09=09=09=09=09= =09=09=20errmsg_internal("invalid=20archival=20report=20message=20with=20= length=20%d",=0A+=09=09=09=09=09=09=09=09=09=09=09=20(int)=20len)));=0A+=0A= +=09=09=09=09memcpy(primary_last_archived,=20buf,=20len);=0A+=09=09=09=09= primary_last_archived[len]=20=3D=20'\0';=0A+=0A+=09=09=09=09/*=20Verify=20= it=20contains=20only=20valid=20characters=20*/=0A+=09=09=09=09if=20= (strspn(buf,=20VALID_XFN_CHARS)=20!=3D=20len)=0A+=09=09=09=09{=0A+=09=09=09= =09=09primary_last_archived[0]=20=3D=20'\0';=0A+=09=09=09=09=09= ereport(ERROR,=0A+=09=09=09=09=09=09=09= (errcode(ERRCODE_PROTOCOL_VIOLATION),=0A+=09=09=09=09=09=09=09=20= errmsg_internal("unexpected=20character=20in=20primary's=20last=20= archived=20filename")));=0A+=09=09=09=09}=0A+=0A+=09=09=09=09= ProcessArchivalReport();=0A+=09=09=09=09break;=0A+=09=09=09}=0A=20=09=09= default:=0A=20=09=09=09ereport(ERROR,=0A=20=09=09=09=09=09= (errcode(ERRCODE_PROTOCOL_VIOLATION),=0A@@=20-1095,12=20+1125,39=20@@=20= XLogWalRcvClose(XLogRecPtr=20recptr,=20TimeLineID=20tli)=0A=20=0A=20=09= /*=0A=20=09=20*=20Create=20.done=20file=20forcibly=20to=20prevent=20the=20= streamed=20segment=20from=20being=0A-=09=20*=20archived=20later.=0A+=09=20= *=20archived=20later,=20unless=20archive_mode=20is=20'always'=20or=20= 'shared'.=0A+=09=20*=0A+=09=20*=20In=20'always'=20mode,=20the=20standby=20= archives=20independently.=0A+=09=20*=0A+=09=20*=20In=20'shared'=20mode,=20= we=20optimize=20by=20checking=20if=20this=20segment=20is=20already=0A+=09= =20*=20covered=20by=20the=20last=20archival=20report=20from=20the=20= primary.=20If=20so,=20create=0A+=09=20*=20.done=20directly.=20Otherwise,=20= 
create=20.ready=20and=20wait=20for=20the=20next=20report.=0A=20=09=20*/=0A= -=09if=20(XLogArchiveMode=20!=3D=20ARCHIVE_MODE_ALWAYS)=0A-=09=09= XLogArchiveForceDone(xlogfname);=0A-=09else=0A+=09if=20(XLogArchiveMode=20= =3D=3D=20ARCHIVE_MODE_ALWAYS)=0A+=09{=0A=20=09=09= XLogArchiveNotify(xlogfname);=0A+=09}=0A+=09else=20if=20(XLogArchiveMode=20= =3D=3D=20ARCHIVE_MODE_SHARED)=0A+=09{=0A+=09=09/*=0A+=09=09=20*=20In=20= shared=20mode,=20check=20if=20this=20segment=20is=20already=20archived=20= on=20primary.=0A+=09=09=20*=20If=20we're=20on=20the=20same=20timeline=20= and=20this=20segment=20is=20<=3D=20last=20archived,=0A+=09=09=20*=20mark=20= it=20.done=20immediately.=20Otherwise=20create=20.ready.=0A+=09=09=20*/=0A= +=09=09if=20(primary_last_archived_tli=20=3D=3D=20recvFileTLI=20&&=0A+=09= =09=09recvSegNo=20<=3D=20primary_last_archived_segno)=0A+=09=09{=0A+=09=09= =09XLogArchiveForceDone(xlogfname);=0A+=09=09}=0A+=09=09else=0A+=09=09{=0A= +=09=09=09XLogArchiveNotify(xlogfname);=0A+=09=09}=0A+=09}=0A+=09else=0A= +=09{=0A+=09=09XLogArchiveForceDone(xlogfname);=0A+=09}=0A=20=0A=20=09= recvFile=20=3D=20-1;=0A=20}=0A@@=20-1277,6=20+1334,87=20@@=20= XLogWalRcvSendHSFeedback(bool=20immed)=0A=20=09=09= primary_has_standby_xmin=20=3D=20false;=0A=20}=0A=20=0A+/*=0A+=20*=20= Process=20archival=20report=20from=20primary.=0A+=20*=0A+=20*=20The=20= primary=20sends=20us=20the=20last=20WAL=20segment=20it=20has=20archived.=20= We=20scan=20the=0A+=20*=20archive_status=20directory=20for=20.ready=20= files=20and=20mark=20segments=20on=20the=20same=0A+=20*=20timeline=20as=20= .done=20if=20they're=20<=3D=20the=20reported=20segment.=0A+=20*/=0A= +static=20void=0A+ProcessArchivalReport(void)=0A+{=0A+=09TimeLineID=09= reported_tli;=0A+=09XLogSegNo=09reported_segno;=0A+=09DIR=09=09=20=20=20= *status_dir;=0A+=09struct=20dirent=20*status_de;=0A+=09char=09=09= status_path[MAXPGPATH];=0A+=0A+=09elog(DEBUG2,=20"received=20archival=20= 
report=20from=20primary:=20%s",=0A+=09=09=20primary_last_archived);=0A+=0A= +=09/*=20Parse=20the=20reported=20WAL=20filename=20*/=0A+=09if=20= (!IsXLogFileName(primary_last_archived))=0A+=09{=0A+=09=09elog(DEBUG2,=20= "invalid=20WAL=20filename=20in=20archival=20report:=20%s",=0A+=09=09=09=20= primary_last_archived);=0A+=09=09return;=0A+=09}=0A+=0A+=09= XLogFromFileName(primary_last_archived,=20&reported_tli,=20= &reported_segno,=0A+=09=09=09=09=09=20wal_segment_size);=0A+=0A+=09/*=20= Remember=20the=20last=20archived=20segment=20for=20XLogWalRcvClose()=20= */=0A+=09primary_last_archived_tli=20=3D=20reported_tli;=0A+=09= primary_last_archived_segno=20=3D=20reported_segno;=0A+=0A+=09/*=20Scan=20= archive_status=20directory=20for=20.ready=20files=20*/=0A+=09= snprintf(status_path,=20MAXPGPATH,=20XLOGDIR=20"/archive_status");=0A+=09= status_dir=20=3D=20AllocateDir(status_path);=0A+=09if=20(status_dir=20=3D=3D= =20NULL)=0A+=09{=0A+=09=09elog(DEBUG2,=20"could=20not=20open=20= archive_status=20directory:=20%m");=0A+=09=09return;=0A+=09}=0A+=0A+=09= while=20((status_de=20=3D=20ReadDir(status_dir,=20status_path))=20!=3D=20= NULL)=0A+=09{=0A+=09=09char=09=20=20=20*ready_suffix;=0A+=09=09char=09=09= walfile[MAXPGPATH];=0A+=09=09TimeLineID=09file_tli;=0A+=09=09XLogSegNo=09= file_segno;=0A+=09=09/*=20Look=20for=20.ready=20files=20only=20*/=0A+=09=09= ready_suffix=20=3D=20strstr(status_de->d_name,=20".ready");=0A+=09=09if=20= (ready_suffix=20=3D=3D=20NULL=20||=20ready_suffix[6]=20!=3D=20'\0')=0A+=09= =09=09continue;=0A+=0A+=09=09/*=20Extract=20WAL=20filename=20(remove=20= .ready=20suffix)=20*/=0A+=09=09strlcpy(walfile,=20status_de->d_name,=20= ready_suffix=20-=20status_de->d_name=20+=201);=0A+=0A+=09=09/*=20Parse=20= the=20WAL=20filename=20*/=0A+=09=09if=20(!IsXLogFileName(walfile))=0A+=09= =09=09continue;=0A+=0A+=09=09XLogFromFileName(walfile,=20&file_tli,=20= &file_segno,=20wal_segment_size);=0A+=0A+=09=09/*=0A+=09=09=20*=20Mark=20= 
as=20.done=20if=20it's=20on=20the=20same=20timeline=20and=20not=20after=20= the=0A+=09=09=20*=20reported=20segment.=20We=20only=20process=20the=20= reported=20timeline=20to=20avoid=0A+=09=09=20*=20marking=20segments=20= from=20parent=20or=20future=20timelines=20prematurely.=0A+=09=09=20*=20= XXX:=20Process=20possible=20TLI=20switches=20happened=20between=20status=20= reports.=0A+=09=09=20*=20For=20now,=20leave=20segments=20on=20previous=20= TLIs=20to=20archive_command.=0A+=09=09=20*/=0A+=09=09if=20(file_tli=20=3D=3D= =20reported_tli=20&&=20file_segno=20<=3D=20reported_segno)=0A+=09=09{=0A= +=09=09=09XLogArchiveForceDone(walfile);=0A+=09=09=09elog(DEBUG3,=20= "marked=20WAL=20segment=20%s=20as=20archived=20(primary=20archived=20up=20= to=20%s)",=0A+=09=09=09=09=20walfile,=20primary_last_archived);=0A+=09=09= }=0A+=09}=0A+=0A+=09FreeDir(status_dir);=0A+}=0A+=0A=20/*=0A=20=20*=20= Update=20shared=20memory=20status=20upon=20receiving=20a=20message=20= from=20primary.=0A=20=20*=0Adiff=20--git=20= a/src/backend/replication/walsender.c=20= b/src/backend/replication/walsender.c=0Aindex=202cde8ebc729..aa045a37d82=20= 100644=0A---=20a/src/backend/replication/walsender.c=0A+++=20= b/src/backend/replication/walsender.c=0A@@=20-189,6=20+189,17=20@@=20= static=20TimestampTz=20last_reply_timestamp=20=3D=200;=0A=20/*=20Have=20= we=20sent=20a=20heartbeat=20message=20asking=20for=20reply,=20since=20= last=20reply?=20*/=0A=20static=20bool=20waiting_for_ping_response=20=3D=20= false;=0A=20=0A+/*=0A+=20*=20Last=20archived=20WAL=20file.=20This=20is=20= fetched=20from=20pgstat=20periodically=20and=20sent=0A+=20*=20to=20the=20= standby.=20last_archival_report_timestamp=20tracks=20when=20we=20last=20= sent=0A+=20*=20the=20report=20to=20avoid=20excessive=20pgstat=20access.=0A= +=20*/=0A+static=20char=20last_archived_wal[MAX_XFN_CHARS=20+=201];=0A= +static=20TimestampTz=20last_archival_report_timestamp=20=3D=200;=0A+=0A= +/*=20Interval=20for=20sending=20archival=20reports=20(10=20seconds)=20= 
*/=0A+#define=20ARCHIVAL_REPORT_INTERVAL=2010000=0A+=0A=20/*=0A=20=20*=20= While=20streaming=20WAL=20in=20Copy=20mode,=20streamingDoneSending=20is=20= set=20to=20true=0A=20=20*=20after=20we=20have=20sent=20CopyDone.=20We=20= should=20not=20send=20any=20more=20CopyData=20messages=0A@@=20-276,6=20= +287,7=20@@=20static=20void=20ProcessStandbyMessage(void);=0A=20static=20= void=20ProcessStandbyReplyMessage(void);=0A=20static=20void=20= ProcessStandbyHSFeedbackMessage(void);=0A=20static=20void=20= ProcessStandbyPSRequestMessage(void);=0A+static=20void=20= WalSndArchivalReport(void);=0A=20static=20void=20= ProcessRepliesIfAny(void);=0A=20static=20void=20= ProcessPendingWrites(void);=0A=20static=20void=20WalSndKeepalive(bool=20= requestReply,=20XLogRecPtr=20writePtr);=0A@@=20-2748,6=20+2760,84=20@@=20= ProcessStandbyHSFeedbackMessage(void)=0A=20=09}=0A=20}=0A=20=0A+/*=0A+=20= *=20Send=20archival=20status=20report=20to=20standby.=0A+=20*=0A+=20*=20= This=20is=20called=20periodically=20during=20physical=20replication=20to=20= inform=20the=0A+=20*=20standby=20about=20the=20last=20WAL=20segment=20= archived=20by=20the=20primary.=20The=20standby=0A+=20*=20can=20then=20= mark=20segments=20up=20to=20that=20point=20as=20.done,=20allowing=20them=20= to=20be=0A+=20*=20recycled.=20This=20prevents=20WAL=20loss=20during=20= standby=20promotion.=0A+=20*/=0A+static=20void=0A= +WalSndArchivalReport(void)=0A+{=0A+=09PgStat_ArchiverStats=20= *archiver_stats;=0A+=09TimestampTz=20now;=0A+=09char=09=20=20=20= *last_archived;=0A+=0A+=09/*=20Only=20send=20reports=20when=20= archive_mode=3Dshared=20*/=0A+=09if=20(XLogArchiveMode=20!=3D=20= ARCHIVE_MODE_SHARED)=0A+=09=09return;=0A+=0A+=09/*=20Only=20send=20= reports=20during=20physical=20streaming=20replication,=20not=20during=20= backup=20*/=0A+=09if=20(MyWalSnd->kind=20!=3D=20= REPLICATION_KIND_PHYSICAL)=0A+=09=09return;=0A+=09if=20(MyWalSnd->state=20= !=3D=20WALSNDSTATE_CATCHUP=20&&=0A+=09=09MyWalSnd->state=20!=3D=20= 
WALSNDSTATE_STREAMING)=0A+=09=09return;=0A+=0A+=09/*=0A+=09=20*=20Don't=20= send=20to=20temporary=20replication=20slots=20(used=20by=20= pg_basebackup).=0A+=09=20*=20Connections=20without=20slots=20(regular=20= standbys)=20are=20OK.=0A+=09=20*/=0A+=09if=20(MyReplicationSlot=20!=3D=20= NULL=20&&=0A+=09=09MyReplicationSlot->data.persistency=20=3D=3D=20= RS_TEMPORARY)=0A+=09=09return;=0A+=0A+=09now=20=3D=20= GetCurrentTimestamp();=0A+=0A+=09/*=0A+=09=20*=20Send=20report=20at=20= most=20once=20per=20ARCHIVAL_REPORT_INTERVAL=20(10=20seconds).=0A+=09=20= *=20This=20avoids=20excessive=20pgstat=20access.=0A+=09=20*/=0A+=09if=20= (now=20<=20TimestampTzPlusMilliseconds(last_archival_report_timestamp,=0A= +=09=09=09=09=09=09=09=09=09=09=20=20ARCHIVAL_REPORT_INTERVAL))=0A+=09=09= return;=0A+=09last_archival_report_timestamp=20=3D=20now;=0A+=09/*=0A+=09= =20*=20Get=20archiver=20statistics.=20We=20use=20non-blocking=20access=20= to=20avoid=20delaying=0A+=09=20*=20replication=20if=20stats=20collector=20= is=20slow.=20If=20stats=20are=20unavailable=20or=0A+=09=20*=20stale,=20= we'll=20just=20try=20again=20at=20the=20next=20interval.=0A+=09=20*/=0A+=09= archiver_stats=20=3D=20pgstat_fetch_stat_archiver();=0A+=09if=20= (archiver_stats=20=3D=3D=20NULL)=0A+=09=09return;=0A+=0A+=09= last_archived=20=3D=20archiver_stats->last_archived_wal;=0A+=09/*=0A+=09=20= *=20Only=20send=20a=20report=20if=20the=20last=20archived=20WAL=20has=20= changed.=20This=20is=20both=0A+=09=20*=20an=20optimization=20and=20= ensures=20we=20don't=20send=20empty=20reports=20on=20startup.=0A+=09=20= */=0A+=09if=20(strcmp(last_archived,=20last_archived_wal)=20=3D=3D=200)=0A= +=09=09return;=0A+=0A+=09/*=20Only=20send=20reports=20for=20WAL=20= segments,=20not=20backup=20history=20files=20or=20other=20archived=20= files=20*/=0A+=09if=20(!IsXLogFileName(last_archived))=0A+=09=09return;=0A= +=0A+=09elog(DEBUG2,=20"sending=20archival=20report:=20%s",=20= 
last_archived);=0A+=0A+=09/*=20Remember=20what=20we=20sent=20*/=0A+=09= strlcpy(last_archived_wal,=20last_archived,=20= sizeof(last_archived_wal));=0A+=0A+=09/*=20Construct=20the=20message...=20= */=0A+=09resetStringInfo(&output_message);=0A+=09= pq_sendbyte(&output_message,=20PqReplMsg_ArchiveStatusReport);=0A+=09= pq_sendbytes(&output_message,=20last_archived,=20strlen(last_archived));=0A= +=09/*=20...=20and=20send=20it=20wrapped=20in=20CopyData=20*/=0A+=09= pq_putmessage_noblock(PqMsg_CopyData,=20output_message.data,=20= output_message.len);=0A+}=0A+=0A=20/*=0A=20=20*=20Process=20the=20= request=20for=20a=20primary=20status=20update=20message.=0A=20=20*/=0A@@=20= -4227,6=20+4317,9=20@@=20WalSndKeepaliveIfNecessary(void)=0A=20=09=09if=20= (pq_flush_if_writable()=20!=3D=200)=0A=20=09=09=09WalSndShutdown();=0A=20= =09}=0A+=0A+=09/*=20Send=20archival=20status=20report=20if=20needed=20*/=0A= +=09WalSndArchivalReport();=0A=20}=0A=20=0A=20/*=0Adiff=20--git=20= a/src/include/access/xlog.h=20b/src/include/access/xlog.h=0Aindex=20= fdfb572467b..7b0caa5cbf6=20100644=0A---=20a/src/include/access/xlog.h=0A= +++=20b/src/include/access/xlog.h=0A@@=20-66,6=20+66,7=20@@=20typedef=20= enum=20ArchiveMode=0A=20=09ARCHIVE_MODE_OFF=20=3D=200,=09=09/*=20= disabled=20*/=0A=20=09ARCHIVE_MODE_ON,=09=09=09/*=20enabled=20while=20= server=20is=20running=20normally=20*/=0A=20=09ARCHIVE_MODE_ALWAYS,=09=09= /*=20enabled=20always=20(even=20during=20recovery)=20*/=0A+=09= ARCHIVE_MODE_SHARED,=09=09/*=20shared=20archive=20between=20primary=20= and=20standby=20*/=0A=20}=20ArchiveMode;=0A=20extern=20PGDLLIMPORT=20int=20= XLogArchiveMode;=0A=20=0Adiff=20--git=20a/src/include/libpq/protocol.h=20= b/src/include/libpq/protocol.h=0Aindex=20eae8f0e7238..d22aaf9e225=20= 100644=0A---=20a/src/include/libpq/protocol.h=0A+++=20= b/src/include/libpq/protocol.h=0A@@=20-72,6=20+72,7=20@@=0A=20=0A=20/*=20= Replication=20codes=20sent=20by=20the=20primary=20(wrapped=20in=20= 
CopyData=20messages).=20*/=0A=20=0A+#define=20= PqReplMsg_ArchiveStatusReport=20'a'=0A=20#define=20PqReplMsg_Keepalive=09= =09=09'k'=0A=20#define=20PqReplMsg_PrimaryStatusUpdate=20's'=0A=20= #define=20PqReplMsg_WALData=09=09=09'w'=0Adiff=20--git=20= a/src/test/recovery/t/050_archive_shared.pl=20= b/src/test/recovery/t/050_archive_shared.pl=0Anew=20file=20mode=20100644=0A= index=2000000000000..397b71ad79d=0A---=20/dev/null=0A+++=20= b/src/test/recovery/t/050_archive_shared.pl=0A@@=20-0,0=20+1,270=20@@=0A= +#=20Copyright=20(c)=202025,=20PostgreSQL=20Global=20Development=20Group=0A= +=0A+#=20Test=20archive_mode=3Dshared=20for=20coordinated=20WAL=20= archiving=20between=20primary=20and=20standby=0A+use=20strict;=0A+use=20= warnings=20FATAL=20=3D>=20'all';=0A+use=20PostgreSQL::Test::Cluster;=0A= +use=20PostgreSQL::Test::Utils;=0A+use=20Test::More;=0A+use=20File::Path=20= qw(rmtree);=0A+=0A+#=20Initialize=20primary=20node=20with=20archiving=0A= +my=20$archive_dir=20=3D=20PostgreSQL::Test::Utils::tempdir();=0A+my=20= $primary=20=3D=20PostgreSQL::Test::Cluster->new('primary');=0A= +$primary->init(has_archiving=20=3D>=201,=20allows_streaming=20=3D>=20= 1);=0A+$primary->append_conf('postgresql.conf',=20"=0A+archive_mode=20=3D=20= shared=0A+archive_command=20=3D=20'cp=20%p=20\"$archive_dir\"/%f'=0A= +wal_keep_size=20=3D=20128MB=0A+");=0A+$primary->start;=0A+=0A+#=20= Create=20a=20test=20table=20and=20generate=20some=20WAL=0A= +$primary->safe_psql('postgres',=20'CREATE=20TABLE=20test_table=20(id=20= int,=20data=20text);');=0A+$primary->safe_psql('postgres',=20"INSERT=20= INTO=20test_table=20SELECT=20i,=20'data'=20||=20i=20FROM=20= generate_series(1,=20500)=20i;");=0A+$primary->safe_psql('postgres',=20= 'SELECT=20pg_switch_wal();');=0A+$primary->safe_psql('postgres',=20= "INSERT=20INTO=20test_table=20SELECT=20i,=20'data'=20||=20i=20FROM=20= generate_series(501,=201000)=20i;");=0A+$primary->safe_psql('postgres',=20= 
'SELECT=20pg_switch_wal();');=0A+=0A+#=20Wait=20for=20archiver=20to=20= archive=20segments=0A+$primary->poll_query_until('postgres',=0A+=09= "SELECT=20archived_count=20>=200=20FROM=20pg_stat_archiver")=0A+=09or=20= die=20"Timed=20out=20waiting=20for=20archiver=20to=20start";=0A+=0A+my=20= $archived_count=20=3D=20()=20=3D=20glob("$archive_dir/*");=0A= +ok($archived_count=20>=200,=20"primary=20has=20archived=20WAL=20files=20= to=20shared=20archive");=0A+note("Primary=20archived=20$archived_count=20= files");=0A+=0A+#=20Take=20backup=20for=20standby=0A+my=20$backup_name=20= =3D=20'standby_backup';=0A+$primary->backup($backup_name);=0A+=0A+#=20= Exclude=20possible=20race=20condition=20when=20backup=20WAL=20is=20last=20= archived=0A+$primary->safe_psql('postgres',=20"INSERT=20INTO=20= test_table=20SELECT=20i,=20'data'=20||=20i=20FROM=20generate_series(501,=20= 1000)=20i;");=0A+$primary->safe_psql('postgres',=20'SELECT=20= pg_switch_wal();');=0A+=0A+#=20Set=20up=20standby=20with=20= archive_mode=3Dshared=0A+my=20$standby=20=3D=20= PostgreSQL::Test::Cluster->new('standby');=0A= +$standby->init_from_backup($primary,=20$backup_name,=20has_streaming=20= =3D>=201);=0A+$standby->append_conf('postgresql.conf',=20"=0A= +archive_mode=20=3D=20shared=0A+archive_command=20=3D=20'cp=20%p=20= \"$archive_dir\"/%f'=0A+wal_receiver_status_interval=20=3D=201s=0A+");=0A= +$standby->start;=0A+=0A+#=20Wait=20for=20standby=20to=20catch=20up=0A= +$primary->wait_for_catchup($standby);=0A+=0A+#=20Generate=20more=20WAL=20= on=20primary=20(these=20are=20new=20segments=20not=20yet=20archived)=0A= +$primary->safe_psql('postgres',=20"INSERT=20INTO=20test_table=20SELECT=20= i,=20'data'=20||=20i=20FROM=20generate_series(1001,=201500)=20i;");=0A= +$primary->safe_psql('postgres',=20'SELECT=20pg_switch_wal();');=0A= +$primary->safe_psql('postgres',=20"INSERT=20INTO=20test_table=20SELECT=20= i,=20'data'=20||=20i=20FROM=20generate_series(1501,=202000)=20i;");=0A= 
+$primary->safe_psql('postgres',=20'SELECT=20pg_switch_wal();');=0A+=0A= +#=20Wait=20for=20standby=20to=20receive=20the=20new=20WAL=0A= +$primary->wait_for_catchup($standby);=0A+=0A+#=20Check=20that=20standby=20= has=20.ready=20or=20.done=20files=20for=20the=20newly=20received=20= segments.=0A+#=20Normally=20they=20should=20be=20.ready=20(not=20yet=20= archived=20by=20primary),=20but=20in=20rare=20cases=0A+#=20the=20= archiver=20could=20be=20very=20fast=20and=20an=20archive=20report=20sent=20= immediately,=20creating=0A+#=20.done=20files=20instead.=20Both=20are=20= correct=20behavior=20-=20the=20key=20is=20that=20files=20exist.=0A+my=20= $standby_archive_status=20=3D=20$standby->data_dir=20.=20= '/pg_wal/archive_status';=0A+my=20$status_count=20=3D=200;=0A+if=20= (opendir(my=20$dh,=20$standby_archive_status))=0A+{=0A+=09my=20@files=20= =3D=20grep=20{=20/\.(ready|done)$/=20}=20readdir($dh);=0A+=09= $status_count=20=3D=20scalar(@files);=0A+=09my=20$ready_count=20=3D=20= scalar(grep=20{=20/\.ready$/=20}=20@files);=0A+=09my=20$done_count=20=3D=20= scalar(grep=20{=20/\.done$/=20}=20@files);=0A+=09note("Standby=20has=20= $ready_count=20.ready=20files=20and=20$done_count=20.done=20files");=0A+=09= closedir($dh);=0A+}=0A+cmp_ok($status_count,=20'>',=200,=20"standby=20= creates=20archive=20status=20files=20for=20received=20WAL");=0A+=0A+#=20= Generate=20more=20WAL=20and=20wait=20for=20archiving=20on=20primary=0A= +my=20$initial_archived=20=3D=20$primary->safe_psql('postgres',=20= 'SELECT=20archived_count=20FROM=20pg_stat_archiver');=0A= +$primary->safe_psql('postgres',=20"INSERT=20INTO=20test_table=20SELECT=20= i,=20'more-data'=20||=20i=20FROM=20generate_series(2001,=202500)=20i;");=0A= +$primary->safe_psql('postgres',=20'SELECT=20pg_switch_wal();');=0A= +$primary->safe_psql('postgres',=20"INSERT=20INTO=20test_table=20SELECT=20= i,=20'more-data2'=20||=20i=20FROM=20generate_series(2501,=203000)=20= i;");=0A+$primary->safe_psql('postgres',=20'SELECT=20pg_switch_wal();');=0A= 
+=0A+#=20Wait=20for=20primary=20to=20archive=20the=20new=20segments=0A= +$primary->poll_query_until('postgres',=0A+=09"SELECT=20archived_count=20= >=20$initial_archived=20FROM=20pg_stat_archiver")=0A+=09or=20die=20= "Timed=20out=20waiting=20for=20primary=20to=20archive=20new=20segments";=0A= +=0A+#=20Wait=20for=20standby=20to=20catch=20up=20(archive=20status=20is=20= sent=20during=20replication)=0A+$primary->wait_for_catchup($standby);=0A= +=0A+#=20Wait=20for=20primary=20to=20send=20archival=20status=20updates=20= and=20standby=20to=20process=20them=0A+#=20The=20standby=20should=20mark=20= segments=20as=20.done=20after=20receiving=20archive=20status=20from=20= primary=0A+my=20$done_count=20=3D=200;=0A+for=20(my=20$i=20=3D=200;=20$i=20= <=20$PostgreSQL::Test::Utils::timeout_default;=20$i++)=0A+{=0A+=09= $done_count=20=3D=200;=0A+=09if=20(opendir(my=20$dh,=20= $standby_archive_status))=0A+=09{=0A+=09=09$done_count=20=3D=20= scalar(grep=20{=20/\.done$/=20}=20readdir($dh));=0A+=09=09closedir($dh);=0A= +=09}=0A+=09last=20if=20$done_count=20>=200;=0A+=09sleep(1);=0A+}=0A= +ok($done_count=20>=200,=20"standby=20marked=20segments=20as=20.done=20= after=20primary's=20archival=20report");=0A+note("Standby=20has=20= $done_count=20.done=20files");=0A+=0A= +#########################################################################= ######=0A+#=20Test=202:=20Standby=20promotion=20-=20verify=20archiver=20= activates=0A= +#########################################################################= ######=0A+=0A+#=20Before=20promotion,=20verify=20archiver=20is=20not=20= running=20on=20standby=20(shared=20mode=20during=20recovery)=0A+#=20In=20= shared=20mode,=20the=20standby's=20archiver=20should=20not=20be=20= archiving=20during=20recovery=0A+my=20$archived_before=20=3D=20= $standby->safe_psql('postgres',=20=0A+=09"SELECT=20archived_count=20FROM=20= pg_stat_archiver");=0A+is($archived_before,=20'0',=20=0A+=09"archiver=20= 
not=20active=20on=20standby=20before=20promotion=20(archived_count=3D0)");= =0A+=0A+#=20Verify=20standby=20is=20still=20in=20recovery=20before=20= promoting=0A+my=20$in_recovery=20=3D=20$standby->safe_psql('postgres',=20= "SELECT=20pg_is_in_recovery();");=0A+is($in_recovery,=20't',=20"standby=20= is=20in=20recovery=20before=20promotion");=0A+=0A+#=20Promote=20the=20= standby=0A+$standby->promote;=0A+$standby->poll_query_until('postgres',=20= "SELECT=20NOT=20pg_is_in_recovery();");=0A+=0A+#=20Generate=20WAL=20on=20= new=20primary=20(former=20standby)=0A+$standby->safe_psql('postgres',=20= "INSERT=20INTO=20test_table=20SELECT=20i,=20'post-promotion'=20||=20i=20= FROM=20generate_series(2001,=202500)=20i;");=0A= +$standby->safe_psql('postgres',=20'SELECT=20pg_switch_wal();');=0A+=0A= +#=20Wait=20for=20archiver=20to=20activate=20and=20archive=20the=20new=20= WAL=0A+#=20Check=20pg_stat_archiver=20to=20verify=20archiving=20is=20= happening=0A+$standby->poll_query_until('postgres',=0A+=09"SELECT=20= archived_count=20>=200=20FROM=20pg_stat_archiver")=0A+=09or=20die=20= "Timed=20out=20waiting=20for=20promoted=20standby=20to=20start=20= archiving";=0A+pass("promoted=20standby=20started=20archiving");=0A+=0A= +#=20Verify=20data=20integrity=0A+my=20$count=20=3D=20= $standby->safe_psql('postgres',=20'SELECT=20COUNT(*)=20FROM=20= test_table;');=0A+ok($count=20>=3D=202500,=20"promoted=20standby=20has=20= all=20data=20(got=20$count=20rows)");=0A+=0A= +#########################################################################= ######=0A+#=20Test=203:=20Cascading=20replication=0A= +#########################################################################= ######=0A+=0A+#=20Take=20a=20backup=20from=20the=20promoted=20standby=20= (now=20the=20new=20primary)=0A+my=20$promoted_backup=20=3D=20= 'promoted_backup';=0A+$standby->backup($promoted_backup);=0A+=0A+#=20Set=20= up=20second-level=20standby=20(cascading=20from=20first=20standby,=20now=20= promoted)=0A+my=20$standby2=20=3D=20= 
PostgreSQL::Test::Cluster->new('standby2');=0A= +$standby2->init_from_backup($standby,=20$promoted_backup,=20= has_streaming=20=3D>=201);=0A+$standby2->append_conf('postgresql.conf',=20= "=0A+archive_mode=20=3D=20shared=0A+archive_command=20=3D=20'cp=20%p=20= \"$archive_dir\"/%f'=0A+wal_receiver_status_interval=20=3D=201s=0A+");=0A= +$standby2->start;=0A+=0A+#=20Generate=20WAL=20on=20promoted=20standby=20= (now=20primary=20for=20standby2)=0A+my=20$cascading_archived_before=20=3D=20= $standby->safe_psql('postgres',=20'SELECT=20archived_count=20FROM=20= pg_stat_archiver');=0A+$standby->safe_psql('postgres',=20"INSERT=20INTO=20= test_table=20SELECT=20i,=20'cascading'=20||=20i=20FROM=20= generate_series(2501,=203000)=20i;");=0A+$standby->safe_psql('postgres',=20= 'SELECT=20pg_switch_wal();');=0A+=0A+#=20Wait=20for=20the=20promoted=20= standby=20(acting=20as=20primary)=20to=20archive=20the=20new=20segment=0A= +$standby->poll_query_until('postgres',=0A+=09"SELECT=20archived_count=20= >=20$cascading_archived_before=20FROM=20pg_stat_archiver")=0A+=09or=20= die=20"Timed=20out=20waiting=20for=20primary=20to=20archive=20segment=20= in=20cascading=20test";=0A+=0A+#=20Wait=20for=20cascading=20standby=20to=20= catch=20up=0A+$standby->wait_for_catchup($standby2);=0A+=0A+#=20Wait=20= for=20cascading=20standby=20to=20receive=20archive=20status=20and=20mark=20= segments=20as=20.done=0A+my=20$standby2_archive_status=20=3D=20= $standby2->data_dir=20.=20'/pg_wal/archive_status';=0A+my=20= $standby2_done_count=20=3D=200;=0A+for=20(my=20$i=20=3D=200;=20$i=20<=20= $PostgreSQL::Test::Utils::timeout_default;=20$i++)=0A+{=0A+=09= $standby2_done_count=20=3D=200;=0A+=09if=20(opendir(my=20$dh,=20= $standby2_archive_status))=0A+=09{=0A+=09=09$standby2_done_count=20=3D=20= scalar(grep=20{=20/\.done$/=20}=20readdir($dh));=0A+=09=09closedir($dh);=0A= +=09}=0A+=09last=20if=20$standby2_done_count=20>=200;=0A+=09sleep(1);=0A= +}=0A+ok($standby2_done_count=20>=200,=20"cascading=20standby=20marks=20= 
segments=20as=20.done");=0A+note("Cascading=20standby=20has=20= $standby2_done_count=20.done=20files");=0A+=0A+#=20Verify=20cascading=20= standby=20has=20all=20data=0A+my=20$standby2_count=20=3D=20= $standby2->safe_psql('postgres',=20'SELECT=20COUNT(*)=20FROM=20= test_table;');=0A+ok($standby2_count=20>=3D=203000,=20"cascading=20= standby=20has=20all=20data=20(got=20$standby2_count=20rows)");=0A+=0A= +#########################################################################= ######=0A+#=20Test=204:=20Multiple=20standbys=20from=20same=20primary=0A= +#########################################################################= ######=0A+=0A+#=20Create=20third=20standby=20from=20promoted=20standby=20= (current=20primary)=0A+my=20$standby3=20=3D=20= PostgreSQL::Test::Cluster->new('standby3');=0A+my=20$backup2=20=3D=20= 'multi_standby_backup';=0A+$standby->backup($backup2);=0A= +$standby3->init_from_backup($standby,=20$backup2,=20has_streaming=20=3D>=20= 1);=0A+$standby3->append_conf('postgresql.conf',=20"=0A+archive_mode=20=3D= =20shared=0A+archive_command=20=3D=20'cp=20%p=20\"$archive_dir\"/%f'=0A= +wal_receiver_status_interval=20=3D=201s=0A+");=0A+$standby3->start;=0A+=0A= +#=20Generate=20WAL=20and=20ensure=20both=20standbys=20receive=20it=0A= +my=20$standby_archived_before=20=3D=20$standby->safe_psql('postgres',=20= 'SELECT=20archived_count=20FROM=20pg_stat_archiver');=0A= +$standby->safe_psql('postgres',=20"INSERT=20INTO=20test_table=20SELECT=20= i,=20'multi'=20||=20i=20FROM=20generate_series(3001,=203500)=20i;");=0A= +$standby->safe_psql('postgres',=20'SELECT=20pg_switch_wal();');=0A+=0A= +#=20Wait=20for=20the=20promoted=20standby=20(acting=20as=20primary)=20= to=20archive=20the=20new=20segment=0A= +$standby->poll_query_until('postgres',=0A+=09"SELECT=20archived_count=20= >=20$standby_archived_before=20FROM=20pg_stat_archiver")=0A+=09or=20die=20= "Timed=20out=20waiting=20for=20primary=20to=20archive=20segment=20in=20= 
multi-standby=20test";=0A+=0A+$standby->wait_for_catchup($standby2);=0A= +$standby->wait_for_catchup($standby3);=0A+=0A+#=20Verify=20both=20= standbys=20eventually=20mark=20segments=20as=20.done=0A+my=20= $standby3_archive_status=20=3D=20$standby3->data_dir=20.=20= '/pg_wal/archive_status';=0A+=0A+for=20(my=20$i=20=3D=200;=20$i=20<=20= $PostgreSQL::Test::Utils::timeout_default;=20$i++)=0A+{=0A+=09= $standby2_done_count=20=3D=200;=0A+=09if=20(opendir(my=20$dh,=20= $standby2_archive_status))=0A+=09{=0A+=09=09$standby2_done_count=20=3D=20= scalar(grep=20{=20/\.done$/=20}=20readdir($dh));=0A+=09=09closedir($dh);=0A= +=09}=0A+=09last=20if=20$standby2_done_count=20>=200;=0A+=09sleep(1);=0A= +}=0A+=0A+my=20$standby3_done_count=20=3D=200;=0A+for=20(my=20$i=20=3D=20= 0;=20$i=20<=20$PostgreSQL::Test::Utils::timeout_default;=20$i++)=0A+{=0A= +=09$standby3_done_count=20=3D=200;=0A+=09if=20(opendir(my=20$dh,=20= $standby3_archive_status))=0A+=09{=0A+=09=09$standby3_done_count=20=3D=20= scalar(grep=20{=20/\.done$/=20}=20readdir($dh));=0A+=09=09closedir($dh);=0A= +=09}=0A+=09last=20if=20$standby3_done_count=20>=200;=0A+=09sleep(1);=0A= +}=0A+=0A+ok($standby2_done_count=20>=200,=20"standby2=20marks=20= segments=20as=20.done");=0A+ok($standby3_done_count=20>=200,=20"standby3=20= marks=20segments=20as=20.done");=0A+note("standby2=20has=20= $standby2_done_count=20.done=20files,=20standby3=20has=20= $standby3_done_count=20.done=20files");=0A+=0A+#=20Verify=20both=20= standbys=20have=20all=20data=0A+$standby2_count=20=3D=20= $standby2->safe_psql('postgres',=20'SELECT=20COUNT(*)=20FROM=20= test_table;');=0A+my=20$standby3_count=20=3D=20= $standby3->safe_psql('postgres',=20'SELECT=20COUNT(*)=20FROM=20= test_table;');=0A+ok($standby2_count=20>=3D=203500,=20"standby2=20has=20= all=20data=20(got=20$standby2_count=20rows)");=0A+ok($standby3_count=20= >=3D=203500,=20"standby3=20has=20all=20data=20(got=20$standby3_count=20= rows)");=0A+=0A+done_testing();=0A--=20=0A2.51.2=0A=0A= 
--Apple-Mail=_0C363831-9101-4A4D-87C3-689941891F87 Content-Disposition: attachment; filename=v4-0003-Optimize-ProcessArchivalReport-to-avoid-directory.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="v4-0003-Optimize-ProcessArchivalReport-to-avoid-directory.patch" Content-Transfer-Encoding: quoted-printable =46rom=203478544f5c9603a3f7816f03d8d48390dd870be1=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20Andrey=20Borodin=20=0ADate:=20= Wed,=2011=20Feb=202026=2018:17:25=20+0500=0ASubject:=20[PATCH=20v4=20= 3/3]=20Optimize=20ProcessArchivalReport=20to=20avoid=20directory=0A=20= scans=0A=0AWhen=20archive=20status=20reports=20arrive=20sequentially=20= on=20the=20same=20timeline,=0Adirectly=20generate=20expected=20WAL=20= filenames=20and=20mark=20them=20as=20archived=0Ainstead=20of=20scanning=20= the=20entire=20archive_status=20directory.=0A=0AThis=20optimization=20= reduces=20overhead=20in=20the=20common=20case=20where=20the=20primary=0A= continuously=20archives=20segments.=20Directory=20scan=20is=20still=20= used=20when:=0A-=20Timeline=20changes=20(to=20handle=20ancestor=20= timelines)=0A-=20First=20report=20received=0A-=20Non-sequential=20= reports=0A=0AXLogArchiveForceDone()=20handles=20all=20cases=20internally=20= (checking=20if=20.done=0Aexists,=20if=20.ready=20exists,=20or=20creating=20= .done=20if=20neither=20exists),=20so=20no=0Apre-check=20is=20needed.=0A= ---=0A=20src/backend/replication/walreceiver.c=20|=20196=20= +++++++++++++++++---------=0A=201=20file=20changed,=20132=20= insertions(+),=2064=20deletions(-)=0A=0Adiff=20--git=20= a/src/backend/replication/walreceiver.c=20= b/src/backend/replication/walreceiver.c=0Aindex=20= 1613a5f8752..cda6a5d2df2=20100644=0A---=20= a/src/backend/replication/walreceiver.c=0A+++=20= b/src/backend/replication/walreceiver.c=0A@@=20-137,6=20+137,14=20@@=20= static=20char=20primary_last_archived[MAX_XFN_CHARS=20+=201];=0A=20= static=20TimeLineID=20primary_last_archived_tli=20=3D=200;=0A=20static=20= 
XLogSegNo=20primary_last_archived_segno=20=3D=200;=0A=20=0A+/*=0A+=20*=20= Last=20segment=20we=20successfully=20marked=20as=20.done.=20Used=20to=20= optimize=0A+=20*=20ProcessArchivalReport()=20by=20generating=20expected=20= filenames=20instead=0A+=20*=20of=20scanning=20the=20archive_status=20= directory.=0A+=20*/=0A+static=20TimeLineID=20last_processed_tli=20=3D=20= 0;=0A+static=20XLogSegNo=20last_processed_segno=20=3D=200;=0A+=0A=20/*=20= Prototypes=20for=20private=20functions=20*/=0A=20static=20void=20= WalRcvFetchTimeLineHistoryFiles(TimeLineID=20first,=20TimeLineID=20= last);=0A=20static=20void=20WalRcvWaitForStartPosition(XLogRecPtr=20= *startpoint,=20TimeLineID=20*startpointTLI);=0A@@=20-1351,10=20+1359,9=20= @@=20ProcessArchivalReport(void)=0A=20{=0A=20=09TimeLineID=09= reported_tli;=0A=20=09XLogSegNo=09reported_segno;=0A-=09DIR=09=09=20=20=20= *status_dir;=0A-=09struct=20dirent=20*status_de;=0A=20=09char=09=09= status_path[MAXPGPATH];=0A-=09List=09=20=20=20*tli_history=20=3D=20NIL;=0A= +=09bool=09=09use_direct_check=20=3D=20false;=0A+=09XLogSegNo=09= start_segno;=0A=20=0A=20=09elog(DEBUG2,=20"received=20archival=20report=20= from=20primary:=20%s",=0A=20=09=09=20primary_last_archived);=0A@@=20= -1374,90=20+1381,151=20@@=20ProcessArchivalReport(void)=0A=20=09= primary_last_archived_tli=20=3D=20reported_tli;=0A=20=09= primary_last_archived_segno=20=3D=20reported_segno;=0A=20=0A-=09/*=20= Scan=20archive_status=20directory=20for=20.ready=20files=20*/=0A-=09= snprintf(status_path,=20MAXPGPATH,=20XLOGDIR=20"/archive_status");=0A-=09= status_dir=20=3D=20AllocateDir(status_path);=0A-=09if=20(status_dir=20=3D=3D= =20NULL)=0A+=09/*=0A+=09=20*=20Optimization:=20If=20the=20new=20report=20= is=20on=20the=20same=20timeline=20as=20the=20last=0A+=09=20*=20processed=20= segment=20and=20moves=20forward,=20we=20can=20directly=20check=20for=20= .ready=0A+=09=20*=20files=20for=20segments=20between=20= last_processed_segno=20and=20reported_segno=0A+=09=20*=20instead=20of=20= 
scanning=20the=20entire=20archive_status=20directory.=0A+=09=20*=0A+=09=20= *=20Fall=20back=20to=20directory=20scan=20if:=0A+=09=20*=20-=20Timeline=20= changed=20(need=20to=20handle=20ancestor=20timelines)=0A+=09=20*=20-=20= This=20is=20the=20first=20report=20(last_processed_tli=20=3D=3D=200)=0A+=09= =20*=20-=20Reported=20segment=20is=20not=20ahead=20(nothing=20new=20to=20= process)=0A+=09=20*/=0A+=09if=20(last_processed_tli=20=3D=3D=20= reported_tli=20&&=0A+=09=09last_processed_tli=20!=3D=200=20&&=0A+=09=09= reported_segno=20>=20last_processed_segno)=0A=20=09{=0A-=09=09= elog(DEBUG2,=20"could=20not=20open=20archive_status=20directory:=20%m");=0A= -=09=09return;=0A+=09=09use_direct_check=20=3D=20true;=0A+=09=09= start_segno=20=3D=20last_processed_segno=20+=201;=0A=20=09}=0A=20=0A-=09= while=20((status_de=20=3D=20ReadDir(status_dir,=20status_path))=20!=3D=20= NULL)=0A+=09if=20(use_direct_check)=0A=20=09{=0A-=09=09char=09=20=20=20= *ready_suffix;=0A-=09=09char=09=09walfile[MAXPGPATH];=0A-=09=09= TimeLineID=09file_tli;=0A-=09=09XLogSegNo=09file_segno;=0A-=09=09/*=20= Look=20for=20.ready=20files=20only=20*/=0A-=09=09ready_suffix=20=3D=20= strstr(status_de->d_name,=20".ready");=0A-=09=09if=20(ready_suffix=20=3D=3D= =20NULL=20||=20ready_suffix[6]=20!=3D=20'\0')=0A-=09=09=09continue;=0A-=0A= -=09=09/*=20Extract=20WAL=20filename=20(remove=20.ready=20suffix)=20*/=0A= -=09=09strlcpy(walfile,=20status_de->d_name,=20ready_suffix=20-=20= status_de->d_name=20+=201);=0A-=0A-=09=09/*=20Parse=20the=20WAL=20= filename=20*/=0A-=09=09if=20(!IsXLogFileName(walfile))=0A-=09=09=09= continue;=0A-=0A-=09=09XLogFromFileName(walfile,=20&file_tli,=20= &file_segno,=20wal_segment_size);=0A-=0A=20=09=09/*=0A-=09=09=20*=20Mark=20= as=20.done=20if:=0A-=09=09=20*=201.=20Same=20timeline=20and=20segment=20= <=3D=20reported=20segment,=20OR=0A-=09=09=20*=202.=20Ancestor=20timeline=20= and=20segment=20is=20before=20the=20timeline=20switch=20point=0A-=09=09=20= 
*=0A-=09=09=20*=20For=20ancestor=20timelines:=20if=20primary=20archived=20= segment=20X=20on=20timeline=20T,=0A-=09=09=20*=20then=20all=20segments=20= on=20ancestor=20timelines=20before=20the=20switch=20to=20T=20must=0A-=09=09= =20*=20have=20been=20archived=20(they're=20required=20to=20reach=20= timeline=20T).=0A+=09=09=20*=20Direct=20check:=20generate=20filenames=20= for=20expected=20segments.=0A+=09=09=20*=20XLogArchiveForceDone()=20will=20= handle=20the=20case=20where=20.ready=20doesn't=0A+=09=09=20*=20exist=20= or=20.done=20already=20exists,=20so=20no=20need=20to=20stat()=20first.=0A= =20=09=09=20*/=0A-=09=09if=20(file_tli=20=3D=3D=20reported_tli=20&&=20= file_segno=20<=3D=20reported_segno)=0A+=09=09XLogSegNo=09segno;=0A+=0A+=09= =09for=20(segno=20=3D=20start_segno;=20segno=20<=3D=20reported_segno;=20= segno++)=0A=20=09=09{=0A-=09=09=09/*=20Same=20timeline,=20segment=20= already=20archived=20*/=0A+=09=09=09char=09=09walfile[MAXFNAMELEN];=0A+=0A= +=09=09=09/*=20Generate=20WAL=20filename=20and=20mark=20as=20archived=20= */=0A+=09=09=09XLogFileName(walfile,=20reported_tli,=20segno,=20= wal_segment_size);=0A=20=09=09=09XLogArchiveForceDone(walfile);=0A=20=09=09= =09elog(DEBUG3,=20"marked=20WAL=20segment=20%s=20as=20archived=20= (primary=20archived=20up=20to=20%s)",=0A=20=09=09=09=09=20walfile,=20= primary_last_archived);=0A+=0A+=09=09=09/*=20Track=20the=20last=20= segment=20we=20processed=20*/=0A+=09=09=09last_processed_tli=20=3D=20= reported_tli;=0A+=09=09=09last_processed_segno=20=3D=20segno;=0A+=09=09}=0A= +=09}=0A+=09else=0A+=09{=0A+=09=09/*=0A+=09=09=20*=20Directory=20scan:=20= needed=20when=20timeline=20changed=20or=20first=20report.=0A+=09=09=20*=20= This=20handles=20both=20same-timeline=20and=20ancestor-timeline=20cases.=0A= +=09=09=20*/=0A+=09=09DIR=09=09=20=20=20*status_dir;=0A+=09=09struct=20= dirent=20*status_de;=0A+=09=09List=09=20=20=20*tli_history=20=3D=20NIL;=0A= +=0A+=09=09snprintf(status_path,=20MAXPGPATH,=20XLOGDIR=20= 
"/archive_status");=0A+=09=09status_dir=20=3D=20= AllocateDir(status_path);=0A+=09=09if=20(status_dir=20=3D=3D=20NULL)=0A+=09= =09{=0A+=09=09=09elog(DEBUG2,=20"could=20not=20open=20archive_status=20= directory:=20%m");=0A+=09=09=09return;=0A=20=09=09}=0A-=09=09else=20if=20= (file_tli=20!=3D=20reported_tli)=0A+=0A+=09=09while=20((status_de=20=3D=20= ReadDir(status_dir,=20status_path))=20!=3D=20NULL)=0A=20=09=09{=0A+=09=09= =09char=09=20=20=20*ready_suffix;=0A+=09=09=09char=09=09= walfile[MAXPGPATH];=0A+=09=09=09TimeLineID=09file_tli;=0A+=09=09=09= XLogSegNo=09file_segno;=0A+=0A+=09=09=09/*=20Look=20for=20.ready=20files=20= only=20*/=0A+=09=09=09ready_suffix=20=3D=20strstr(status_de->d_name,=20= ".ready");=0A+=09=09=09if=20(ready_suffix=20=3D=3D=20NULL=20||=20= ready_suffix[6]=20!=3D=20'\0')=0A+=09=09=09=09continue;=0A+=0A+=09=09=09= /*=20Extract=20WAL=20filename=20(remove=20.ready=20suffix)=20*/=0A+=09=09= =09strlcpy(walfile,=20status_de->d_name,=20ready_suffix=20-=20= status_de->d_name=20+=201);=0A+=0A+=09=09=09/*=20Parse=20the=20WAL=20= filename=20*/=0A+=09=09=09if=20(!IsXLogFileName(walfile))=0A+=09=09=09=09= continue;=0A+=0A+=09=09=09XLogFromFileName(walfile,=20&file_tli,=20= &file_segno,=20wal_segment_size);=0A+=0A=20=09=09=09/*=0A-=09=09=09=20*=20= Different=20timeline=20-=20check=20if=20it's=20an=20ancestor=20and=20if=20= this=0A-=09=09=09=20*=20segment=20is=20before=20the=20timeline=20switch=20= point.=20Only=20read=20timeline=0A-=09=09=09=20*=20history=20if=20we=20= haven't=20already=20(lazy=20loading).=0A+=09=09=09=20*=20Mark=20as=20= .done=20if:=0A+=09=09=09=20*=201.=20Same=20timeline=20and=20segment=20<=3D= =20reported=20segment,=20OR=0A+=09=09=09=20*=202.=20Ancestor=20timeline=20= and=20segment=20is=20before=20the=20timeline=20switch=20point=0A=20=09=09= =09=20*=0A-=09=09=09=20*=20Note:=20Timelines=20form=20a=20tree=20= structure,=20not=20a=20linear=20sequence,=0A-=09=09=09=20*=20so=20we=20= 
can't=20use=20<=20or=20>=20to=20compare=20them.=0A+=09=09=09=20*=20For=20= ancestor=20timelines:=20if=20primary=20archived=20segment=20X=20on=20= timeline=20T,=0A+=09=09=09=20*=20then=20all=20segments=20on=20ancestor=20= timelines=20before=20the=20switch=20to=20T=20must=0A+=09=09=09=20*=20= have=20been=20archived=20(they're=20required=20to=20reach=20timeline=20= T).=0A=20=09=09=09=20*/=0A-=09=09=09if=20(tli_history=20=3D=3D=20NIL)=0A= -=09=09=09=09tli_history=20=3D=20readTimeLineHistory(reported_tli);=0A-=0A= -=09=09=09if=20(tliInHistory(file_tli,=20tli_history))=0A+=09=09=09if=20= (file_tli=20=3D=3D=20reported_tli=20&&=20file_segno=20<=3D=20= reported_segno)=0A+=09=09=09{=0A+=09=09=09=09/*=20Same=20timeline,=20= segment=20already=20archived=20*/=0A+=09=09=09=09= XLogArchiveForceDone(walfile);=0A+=09=09=09=09elog(DEBUG3,=20"marked=20= WAL=20segment=20%s=20as=20archived=20(primary=20archived=20up=20to=20= %s)",=0A+=09=09=09=09=09=20walfile,=20primary_last_archived);=0A+=09=09=09= }=0A+=09=09=09else=20if=20(file_tli=20!=3D=20reported_tli)=0A=20=09=09=09= {=0A-=09=09=09=09XLogRecPtr=09switchpoint;=0A-=09=09=09=09XLogSegNo=09= switchpoint_segno;=0A-=0A-=09=09=09=09/*=20Get=20the=20point=20where=20= we=20switched=20away=20from=20this=20timeline=20*/=0A-=09=09=09=09= switchpoint=20=3D=20tliSwitchPoint(file_tli,=20tli_history,=20NULL);=0A-=0A= =20=09=09=09=09/*=0A-=09=09=09=09=20*=20If=20the=20segment=20is=20at=20= or=20before=20the=20switch=20point,=20it=20must=20have=0A-=09=09=09=09=20= *=20been=20archived=20(it's=20required=20to=20reach=20the=20reported=20= timeline).=0A-=09=09=09=09=20*=20The=20segment=20containing=20the=20= switch=20point=20belongs=20to=20the=20old=0A-=09=09=09=09=20*=20timeline=20= up=20to=20the=20switch=20point=20and=20should=20be=20archived.=0A+=09=09=09= =09=20*=20Different=20timeline=20-=20check=20if=20it's=20an=20ancestor=20= and=20if=20this=0A+=09=09=09=09=20*=20segment=20is=20before=20the=20= 
timeline=20switch=20point.=20Only=20read=20timeline=0A+=09=09=09=09=20*=20= history=20if=20we=20haven't=20already=20(lazy=20loading).=0A+=09=09=09=09= =20*=0A+=09=09=09=09=20*=20Note:=20Timelines=20form=20a=20tree=20= structure,=20not=20a=20linear=20sequence,=0A+=09=09=09=09=20*=20so=20we=20= can't=20use=20<=20or=20>=20to=20compare=20them.=0A=20=09=09=09=09=20*/=0A= -=09=09=09=09XLByteToSeg(switchpoint,=20switchpoint_segno,=20= wal_segment_size);=0A-=09=09=09=09if=20(file_segno=20<=3D=20= switchpoint_segno)=0A+=09=09=09=09if=20(tli_history=20=3D=3D=20NIL)=0A+=09= =09=09=09=09tli_history=20=3D=20readTimeLineHistory(reported_tli);=0A+=0A= +=09=09=09=09if=20(tliInHistory(file_tli,=20tli_history))=0A=20=09=09=09=09= {=0A-=09=09=09=09=09XLogArchiveForceDone(walfile);=0A-=09=09=09=09=09= elog(DEBUG3,=20"marked=20ancestor=20timeline=20segment=20%s=20as=20= archived=20(before=20switch=20to=20timeline=20%u)",=0A-=09=09=09=09=09=09= =20walfile,=20reported_tli);=0A+=09=09=09=09=09XLogRecPtr=09switchpoint;=0A= +=09=09=09=09=09XLogSegNo=09switchpoint_segno;=0A+=0A+=09=09=09=09=09/*=20= Get=20the=20point=20where=20we=20switched=20away=20from=20this=20= timeline=20*/=0A+=09=09=09=09=09switchpoint=20=3D=20= tliSwitchPoint(file_tli,=20tli_history,=20NULL);=0A+=0A+=09=09=09=09=09= /*=0A+=09=09=09=09=09=20*=20If=20the=20segment=20is=20at=20or=20before=20= the=20switch=20point,=20it=20must=20have=0A+=09=09=09=09=09=20*=20been=20= archived=20(it's=20required=20to=20reach=20the=20reported=20timeline).=0A= +=09=09=09=09=09=20*=20The=20segment=20containing=20the=20switch=20point=20= belongs=20to=20the=20old=0A+=09=09=09=09=09=20*=20timeline=20up=20to=20= the=20switch=20point=20and=20should=20be=20archived.=0A+=09=09=09=09=09=20= */=0A+=09=09=09=09=09XLByteToSeg(switchpoint,=20switchpoint_segno,=20= wal_segment_size);=0A+=09=09=09=09=09if=20(file_segno=20<=3D=20= switchpoint_segno)=0A+=09=09=09=09=09{=0A+=09=09=09=09=09=09= 
XLogArchiveForceDone(walfile);=0A+=09=09=09=09=09=09elog(DEBUG3,=20= "marked=20ancestor=20timeline=20segment=20%s=20as=20archived=20(before=20= switch=20to=20timeline=20%u)",=0A+=09=09=09=09=09=09=09=20walfile,=20= reported_tli);=0A+=09=09=09=09=09}=0A=20=09=09=09=09}=0A=20=09=09=09}=0A=20= =09=09}=0A-=09}=0A=20=0A-=09FreeDir(status_dir);=0A+=09=09= FreeDir(status_dir);=0A+=0A+=09=09/*=0A+=09=09=20*=20After=20a=20full=20= directory=20scan=20following=20a=20timeline=20change,=20update=0A+=09=09=20= *=20our=20tracking=20to=20the=20newly=20reported=20position=20for=20= future=20optimizations.=0A+=09=09=20*/=0A+=09=09last_processed_tli=20=3D=20= reported_tli;=0A+=09=09last_processed_segno=20=3D=20reported_segno;=0A+=09= }=0A=20}=0A=20=0A=20/*=0A--=20=0A2.51.2=0A=0A= --Apple-Mail=_0C363831-9101-4A4D-87C3-689941891F87 Content-Disposition: attachment; filename=v4-0002-Mark-ancestor-timeline-WAL-segments-as-archived.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="v4-0002-Mark-ancestor-timeline-WAL-segments-as-archived.patch" Content-Transfer-Encoding: quoted-printable =46rom=20b991c5785cffe44e2d42de3a607ddda8e64ca08d=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20Andrey=20Borodin=20=0ADate:=20= Tue,=2010=20Feb=202026=2016:45:10=20+0500=0ASubject:=20[PATCH=20v4=20= 2/3]=20Mark=20ancestor=20timeline=20WAL=20segments=20as=20archived=0A=0A= When=20standby=20receives=20archive=20status=20report,=20check=20if=20= .ready=20files=0Abelong=20to=20ancestor=20timelines=20before=20the=20= switch=20point=20and=20mark=20them=0Aas=20.done=20if=20already=20= archived=20by=20primary.=0A---=0A=20= src/backend/replication/walreceiver.c=20|=2055=20= ++++++++++++++++++++++++---=0A=201=20file=20changed,=2050=20= insertions(+),=205=20deletions(-)=0A=0Adiff=20--git=20= a/src/backend/replication/walreceiver.c=20= b/src/backend/replication/walreceiver.c=0Aindex=20= ed0edd258bb..1613a5f8752=20100644=0A---=20= a/src/backend/replication/walreceiver.c=0A+++=20= 
b/src/backend/replication/walreceiver.c=0A@@=20-1143,6=20+1143,11=20@@=20= XLogWalRcvClose(XLogRecPtr=20recptr,=20TimeLineID=20tli)=0A=20=09=09=20*=20= In=20shared=20mode,=20check=20if=20this=20segment=20is=20already=20= archived=20on=20primary.=0A=20=09=09=20*=20If=20we're=20on=20the=20same=20= timeline=20and=20this=20segment=20is=20<=3D=20last=20archived,=0A=20=09=09= =20*=20mark=20it=20.done=20immediately.=20Otherwise=20create=20.ready.=0A= +=09=09=20*=0A+=09=09=20*=20We=20don't=20check=20ancestor=20timeline=20= cases=20here=20to=20avoid=20reading=20timeline=0A+=09=09=20*=20history=20= files=20on=20every=20segment=20close.=20ProcessArchivalReport()=20will=0A= +=09=09=20*=20handle=20marking=20ancestor=20timeline=20segments=20as=20= .done=20when=20it=20scans=0A+=09=09=20*=20the=20archive_status=20= directory.=0A=20=09=09=20*/=0A=20=09=09if=20(primary_last_archived_tli=20= =3D=3D=20recvFileTLI=20&&=0A=20=09=09=09recvSegNo=20<=3D=20= primary_last_archived_segno)=0A@@=20-1349,6=20+1354,7=20@@=20= ProcessArchivalReport(void)=0A=20=09DIR=09=09=20=20=20*status_dir;=0A=20=09= struct=20dirent=20*status_de;=0A=20=09char=09=09status_path[MAXPGPATH];=0A= +=09List=09=20=20=20*tli_history=20=3D=20NIL;=0A=20=0A=20=09elog(DEBUG2,=20= "received=20archival=20report=20from=20primary:=20%s",=0A=20=09=09=20= primary_last_archived);=0A@@=20-1398,18=20+1404,57=20@@=20= ProcessArchivalReport(void)=0A=20=09=09XLogFromFileName(walfile,=20= &file_tli,=20&file_segno,=20wal_segment_size);=0A=20=0A=20=09=09/*=0A-=09= =09=20*=20Mark=20as=20.done=20if=20it's=20on=20the=20same=20timeline=20= and=20not=20after=20the=0A-=09=09=20*=20reported=20segment.=20We=20only=20= process=20the=20reported=20timeline=20to=20avoid=0A-=09=09=20*=20marking=20= segments=20from=20parent=20or=20future=20timelines=20prematurely.=0A-=09=09= =20*=20XXX:=20Process=20possible=20TLI=20switches=20happened=20between=20= status=20reports.=0A-=09=09=20*=20For=20now,=20leave=20segments=20on=20= 
previous=20TLIs=20to=20archive_command.=0A+=09=09=20*=20Mark=20as=20= .done=20if:=0A+=09=09=20*=201.=20Same=20timeline=20and=20segment=20<=3D=20= reported=20segment,=20OR=0A+=09=09=20*=202.=20Ancestor=20timeline=20and=20= segment=20is=20before=20the=20timeline=20switch=20point=0A+=09=09=20*=0A= +=09=09=20*=20For=20ancestor=20timelines:=20if=20primary=20archived=20= segment=20X=20on=20timeline=20T,=0A+=09=09=20*=20then=20all=20segments=20= on=20ancestor=20timelines=20before=20the=20switch=20to=20T=20must=0A+=09=09= =20*=20have=20been=20archived=20(they're=20required=20to=20reach=20= timeline=20T).=0A=20=09=09=20*/=0A=20=09=09if=20(file_tli=20=3D=3D=20= reported_tli=20&&=20file_segno=20<=3D=20reported_segno)=0A=20=09=09{=0A+=09= =09=09/*=20Same=20timeline,=20segment=20already=20archived=20*/=0A=20=09=09= =09XLogArchiveForceDone(walfile);=0A=20=09=09=09elog(DEBUG3,=20"marked=20= WAL=20segment=20%s=20as=20archived=20(primary=20archived=20up=20to=20= %s)",=0A=20=09=09=09=09=20walfile,=20primary_last_archived);=0A=20=09=09= }=0A+=09=09else=20if=20(file_tli=20!=3D=20reported_tli)=0A+=09=09{=0A+=09= =09=09/*=0A+=09=09=09=20*=20Different=20timeline=20-=20check=20if=20it's=20= an=20ancestor=20and=20if=20this=0A+=09=09=09=20*=20segment=20is=20before=20= the=20timeline=20switch=20point.=20Only=20read=20timeline=0A+=09=09=09=20= *=20history=20if=20we=20haven't=20already=20(lazy=20loading).=0A+=09=09=09= =20*=0A+=09=09=09=20*=20Note:=20Timelines=20form=20a=20tree=20structure,=20= not=20a=20linear=20sequence,=0A+=09=09=09=20*=20so=20we=20can't=20use=20= <=20or=20>=20to=20compare=20them.=0A+=09=09=09=20*/=0A+=09=09=09if=20= (tli_history=20=3D=3D=20NIL)=0A+=09=09=09=09tli_history=20=3D=20= readTimeLineHistory(reported_tli);=0A+=0A+=09=09=09if=20= (tliInHistory(file_tli,=20tli_history))=0A+=09=09=09{=0A+=09=09=09=09= XLogRecPtr=09switchpoint;=0A+=09=09=09=09XLogSegNo=09switchpoint_segno;=0A= +=0A+=09=09=09=09/*=20Get=20the=20point=20where=20we=20switched=20away=20= 
from=20this=20timeline=20*/=0A+=09=09=09=09switchpoint=20=3D=20= tliSwitchPoint(file_tli,=20tli_history,=20NULL);=0A+=0A+=09=09=09=09/*=0A= +=09=09=09=09=20*=20If=20the=20segment=20is=20at=20or=20before=20the=20= switch=20point,=20it=20must=20have=0A+=09=09=09=09=20*=20been=20archived=20= (it's=20required=20to=20reach=20the=20reported=20timeline).=0A+=09=09=09=09= =20*=20The=20segment=20containing=20the=20switch=20point=20belongs=20to=20= the=20old=0A+=09=09=09=09=20*=20timeline=20up=20to=20the=20switch=20= point=20and=20should=20be=20archived.=0A+=09=09=09=09=20*/=0A+=09=09=09=09= XLByteToSeg(switchpoint,=20switchpoint_segno,=20wal_segment_size);=0A+=09= =09=09=09if=20(file_segno=20<=3D=20switchpoint_segno)=0A+=09=09=09=09{=0A= +=09=09=09=09=09XLogArchiveForceDone(walfile);=0A+=09=09=09=09=09= elog(DEBUG3,=20"marked=20ancestor=20timeline=20segment=20%s=20as=20= archived=20(before=20switch=20to=20timeline=20%u)",=0A+=09=09=09=09=09=09= =20walfile,=20reported_tli);=0A+=09=09=09=09}=0A+=09=09=09}=0A+=09=09}=0A= =20=09}=0A=20=0A=20=09FreeDir(status_dir);=0A--=20=0A2.51.2=0A=0A= --Apple-Mail=_0C363831-9101-4A4D-87C3-689941891F87--