From: "Gunnar \"Nick\" Bluth"
To: pgsql-docs@postgresql.org
Cc: Gunnar Nick Bluth
Subject: Actual RC of "restore_command" is relevant for DB startup
Date: Wed, 20 Apr 2016 14:55:34 +0200
Message-ID: <57177C46.6040604@elster.de>

Hello,

I've just stumbled across a certain oddity with "restore_command" while setting up a fresh environment with segmented (i.e., firewalled) networks.

I configured the restore_command as found in the PGBARMan docs (using ssh) and was a bit stunned that, after a restart, I saw this in the logs:

    2016-04-20 13:22:45 CEST [3788]: [2-1] db=,user= FATAL: could not restore file "00000002.history" from archive: child process exited with exit code 255
    2016-04-20 13:22:45 CEST [3786]: [3-1] db=,user= LOG: startup process (PID 3788) exited with exit code 1
    2016-04-20 13:22:45 CEST [3786]: [4-1] db=,user= LOG: aborting startup due to startup process failure

This was obviously caused by:

    ssh: connect to host port 22: Connection timed out
    rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
    rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.0]

Now, the firewall does not let ssh through (yet), so the root cause is quite obvious. However, the docs [1] only state that:

"(...) if the command was terminated by a signal (other than SIGTERM, which is used as part of a database server shutdown) or an error by the shell (such as command not found), then recovery will abort and the server will not start up."

In [2], Kevin Grittner stated that it might be that the command's RC should be <= 255, otherwise it will be assessed as "failed badly; give up".

And indeed, after amending the restore_command with a "|| exit 1", the server starts up just fine, using replication to fetch the missing WALs.
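
For illustration, the workaround can be sketched as a small shell helper that folds "hard" exit codes into a plain failure. This is only a sketch under assumptions: the function name soft_fail and the variable MAX_SOFT_RC are made up for this example, and the default threshold of 125 is an assumption based on the usual shell convention (126/127 when a command cannot be run, 128+n when it died on a signal), not something I verified against the sources. ssh itself exits with 255 when the connection fails, which matches the log above.

```shell
#!/bin/sh
# Hypothetical helper for use inside a restore_command wrapper script.
# ASSUMPTION: exit codes above MAX_SOFT_RC are the ones treated as
# "failed badly; give up". The default of 125 is a guess based on shell
# conventions; adjust it if the real threshold differs.
MAX_SOFT_RC=${MAX_SOFT_RC:-125}

soft_fail() {
    # Run the given command and capture its exit status.
    _rc=0
    "$@" || _rc=$?

    # Fold any "hard" status (e.g. 255 from a failed ssh connection)
    # into 1, so it looks like an ordinary "file not available" failure.
    if [ "$_rc" -gt "$MAX_SOFT_RC" ]; then
        return 1
    fi

    # Pass success and ordinary failures through unchanged.
    return "$_rc"
}
```

A wrapper script would call the real fetch command through it, e.g. "soft_fail rsync ..." with whatever arguments the restore_command used before. The plain "|| exit 1" is coarser (it turns every failure into exit code 1), but it has the same effect for the ssh timeout case.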

That is OK for me right now as a workaround. However, had I found this not while setting everything up from scratch but during a disaster (or simply a downtime or very high load of the archive server while restarting a slave), this (basically undocumented!) behavior would have caused me quite a headache!

I reckon only a few users will expect a connection timeout to fall into the category of "command not found".

Maybe the part "error by the shell (such as command not found)" could be changed to "error by the shell (RC > 254, e.g. command not found or ssh connection failure)" (or rather to whatever the real behaviour is; I didn't check the sources)?

[1] http://www.postgresql.org/docs/current/static/archive-recovery-settings.html
[2] http://stackoverflow.com/questions/10524458/postgresql-9-1-streaming-replication-restore-command-special-meaning-of-exit-co

Best regards,
--
Gunnar "Nick" Bluth
DBA ELSTER

Tel: +49 911/991-4665
Mobil: +49 172/8853339