From: Krzysztof Tomaszewski
Date: Fri, 16 Aug 2024 18:19:46 +0200
Subject: Re: Systemd may start PostgreSQL cluster before time is properly setup on the host machine
To: Christoph Berg, Krzysztof Tomaszewski, pgsql-pkg-debian@postgresql.org

Hi

> Re: Krzysztof Tomaszewski
> > I previously published following analysis on redmine.postgresql.org as
> > an issue #8009 about 2 months ago. As this system seems to be dormant
> > I took liberty to re-post it here. Hope it is OK.
>
> I had seen it, but didn't have the spoons to look closer at it back
> then.

Thank you very much for taking the time to look into this, I really
appreciate it. I didn't mean to put any additional pressure on you; I
just wasn't sure whether my previous message had reached some wise eyes
or not :)

> > According to systemd documentation (systemd.special(7) and
> > systemd-sysv-generator(8)) when systemd generates unit for SysV init
> > script, it transform dependency on $time to dependency on
> > time-sync.target so that time-sync.target seems more appropriate than
> > time-set.target at least from consistency standpoint.
> (...)
> It seems to me that the correct thing to do would be simply:
>
> After=time-sync.target

That would also be my understanding.

> ... and leave the FS dependencies the automatic dependencies added by
> "RequiresMountsFor=/etc/postgresql/%I /var/lib/postgresql/%I" which
> already exists.
>
> > For example, when machine clock is setup in UTC (as it usually should)
> > and local time is different, PostgreSQL during start may interpret
> > time without timezone applied as one with it.
>
> I don't think that's a problem, the system time will always be UTC
> internally, and the system time zone just changes how it is formatted.
> PostgreSQL is always timezone aware.
>
> > As esoteric and contrived as it sounds, I recently stumbled upon a
> > case in production environment, where `pg_postmaster_start_time()` was
> > returning time in the future, with shift consistent with timezone
> > shift in that environment. Investigation of which case led me to above
> > mentioned findings.
>
> If that went wrong, perhaps the machine clock wasn't set to UTC?

Hm, I looked at this again, and on the system where I observed the
problem the "RTC" is in UTC (as it runs in a virtual machine, it is not
a true hardware clock). Nevertheless, my line of reasoning about the
(lack of) time zone information in the early boot stage was probably
wrong, as you pointed out.
It seems that the RTC on that system had drifted substantially (by an
amount similar to the time zone shift, which tricked me), and that is
why PostgreSQL gets the wrong time when it is started before
time-sync.target. As it is a virtual system, the OS cannot truly
(re)set the RTC, so the drift reoccurs after every reboot. The solution
(beyond properly managing the RTC, of course) still seems to be the
same: order the service after time-sync.target.

> > This probably also should be kept consistent among starting
> > mechanisms, i.e. it should be added to unit file or dropped from init
> > script stanza.
>
> TBH, I'm not going to touch the sysv script. It still works in
> chroots/containers without systemd when testing something there, but
> it's not relevant for anything that actually boots.

Sure. My thinking was really only about enhancing the unit file. I just
was not sure whether the time dependency had been left out of the unit
file intentionally for some reason.

> > Another thing of some potential interest may be how RPM packages
> > provided by PostgreSQL project, handle similar unit file. Unit file
> > from RPM package also lacks dependency on any time related target but
> > has additional dependency on syslog.target which may not (do not?)
> > exists at all. As syslog providers do not add dependency on time
> > related targets (only network related), this will not position
> > PostgreSQL start after time is properly setup even in implicit
> > (transitive) way.
>
> Again, we can consider that if there's any "best practise" set of
> dependencies we should add to the service, but since the default
> config isn't set to syslog, I don't see we should include
> syslog.service.

I probably made this point too convoluted, sorry. I do not understand
why the unit file in the RPM package depends on syslog.service either.
I tried to figure that out by analyzing other potential dependencies
pulled in by that dependency, but found none of actual interest.
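
Coming back to the ordering itself: for anyone who wants to try it
before any packaging change, a drop-in along the following lines might
be enough. This is only a sketch under assumptions. The template unit
name postgresql@.service and the drop-in file name time-sync.conf are
assumptions, not taken from this thread. Also, on hosts that use
systemd-timesyncd, time-sync.target only waits for an actual
synchronization when systemd-time-wait-sync.service is enabled.

```ini
# Hypothetical drop-in file (unit name and file name are assumptions):
#   /etc/systemd/system/postgresql@.service.d/time-sync.conf
[Unit]
# Ordering only: start the cluster after time-sync.target is reached.
# No Wants=/Requires= on purpose; per systemd.special(7) the target is
# meant to be pulled in by the time synchronization service itself.
After=time-sync.target
```

After `systemctl daemon-reload`, `systemctl cat` on the unit should show
the drop-in, and `systemctl list-dependencies --after` on the unit
should list time-sync.target.
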
As you pointed out, reasoning about systemd is not always trivial.

> > There are some other differences between unit files provided directly
> > by PostgreSQL project for Debian and RPM based distros, that lead to
> > different behavior among them but are unrelated to this issue (as they
> > mostly relate to how they handle timeouts, with infinity for start and
> > stop in RPM based systems and 1h limit for stopping Postgres cluster
> > in Debian).
>
> The suggested service file from the PG documentation is this:
>
> [Unit]
> Description=PostgreSQL database server
> Documentation=man:postgres(1)
> After=network-online.target
> Wants=network-online.target
>
> [Service]
> Type=notify
> User=postgres
> ExecStart=/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data
> ExecReload=/bin/kill -HUP $MAINPID
> KillMode=mixed
> KillSignal=SIGINT
> TimeoutSec=infinity
>
> [Install]
> WantedBy=multi-user.target

Maybe the documentation should mention After=time-sync.target too?

> I added the TimeoutStopSec=1h so rebooting a server never hangs
> indefinitely (and if 1h isn't enough to write out a checkpoint, I
> don't know).

I pointed out the differences between the rpm and deb packaged service
unit files mostly because I was surprised by their existence, as one of
the initial promises of using systemd unit files over init scripts was
consistency across distributions. The reasoning behind those
differences was also not clear to me. Thanks for sharing your line of
thought behind it.

If I may add my own thinking, having a predictable timeout by default
is valuable. If one needs to make it longer or get rid of it
completely, a unit file drop-in that redefines it is always an option,
and it can be applied to just the instances that would benefit from it.
My guess would also be that a machine stuck during shutdown, probably
with network access already cut off, would lead operators to power it
off anyway.
And having TimeoutStopSec set explicitly may at least hint to
administrators that they may need to tune it for their particular
environment.

Kind regards,
Krzysztof

-- 
ktomaszewski@kartgis.com.pl
*KartGIS sp. z o.o.* | www.kartgis.com.pl
Aleje Jerozolimskie 81
02-001 Warszawa
NIP 9512276974, REGON 141747787
Fax 22-213-96-40
Registered at the District Court for the Capital City of Warsaw in
Warsaw, 12th Commercial Division of the National Court Register, under
KRS number 0000517511
Share capital: 611 300,00 PLN