From: Nick Renders
To: pgsql-general@lists.postgresql.org
Subject: Re: could not open file "global/pg_filenode.map": Operation not permitted
Date: Tue, 12 Mar 2024 10:57:19 +0100
Message-ID: <19556056-40E7-4FA3-A2A1-0A345AEBFD9E@arcict.com>
References: <4D67E594-098F-4234-87D8-68F827AF2531@arcict.com>
 <2E2F11F8-718A-4E6A-81E0-4F5CC1F1273A@arcict.com>

On 11 Mar 2024, at 16:04, Adrian Klaver wrote:

> On 3/11/24 03:11, Nick Renders wrote:
>> Thank you for your reply Laurenz.
>> I don't think it is related to any third party security software. We have several other machines with a similar setup, but this is the only server that has this issue.
>>
>> The one thing different about this machine however, is that it runs 2 instances of Postgres:
>> - cluster A on port 165
>> - cluster B on port 164
>> Cluster A is actually a backup from another Postgres server that is restored on a daily basis via Barman. This means that we login remotely from the Barman server over SSH, stop cluster A's service (port 165), clear the Data folder, restore the latest back into the Data folder, and start up the service again.
>> Cluster B's Data and service (port 164) remain untouched during all this time. This is the cluster that experiences the intermittent "operation not permitted" issue.
>>
>> Over the past 2 weeks, I have suspended our restore script and the issue did not occur.
>> I have just performed another restore on cluster A and now cluster B is throwing errors in the log again.
>
> Since it seems to be the trigger, what are the contents of the restore script?
>
>>
>> Any idea why this is happening? It does not occur with every restore, but it seems to be related anyway.
>>
>> Thanks,
>>
>> Nick Renders
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>


> ...how are A and B connected?

The 2 clusters are not connected.
They run on the same macOS 14 machine with a single Postgres installation ( /Library/PostgreSQL/16/ ) and their respective Data folders are located on the same volume ( /Volumes/Postgres_Data/PostgreSQL/16/data and /Volumes/Postgres_Data/PostgreSQL/16-DML/data ). Besides that, they run independently on 2 different ports, specified in the postgresql.conf.


> ...run them under different users on the system.

Are you referring to the "postgres" user / role? Does that also mean setting up 2 postgres installation directories?


> ...what are the contents of the restore script?

## stop cluster A
ssh postgres@10.0.0.1 '/Library/PostgreSQL/16/bin/pg_ctl -D /Volumes/Postgres_Data/PostgreSQL/16/data stop'

## save config files (ARC_postgresql_16.conf is included in postgresql.conf and contains cluster-specific information like the port number)
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cp ARC_postgresql_16.conf ../ARC_postgresql_16.conf'
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cp pg_hba.conf ../pg_hba.conf'

## clear data directory
ssh postgres@10.0.0.1 'rm -r /Volumes/Postgres_Data/PostgreSQL/16/data/*'

## transfer recovery (this will copy the backup "20240312T040106" and any lingering WAL files into the Data folder)
barman recover --remote-ssh-command 'ssh postgres@10.0.0.1' pg 20240312T040106 /Volumes/Postgres_Data/PostgreSQL/16/data

## restore config files
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cd .. && mv ARC_postgresql_16.conf /Volumes/Postgres_Data/PostgreSQL/16/data/ARC_postgresql_16.conf'
ssh postgres@10.0.0.1 'cd /Volumes/Postgres_Data/PostgreSQL/16/data && cd .. && mv pg_hba.conf /Volumes/Postgres_Data/PostgreSQL/16/data/pg_hba.conf'

## start cluster A
ssh postgres@10.0.0.1 '/Library/PostgreSQL/16/bin/pg_ctl -D /Volumes/Postgres_Data/PostgreSQL/16/data start > /dev/null'

This script runs on a daily basis at 4:30 AM.
It did so this morning and there was no issue with cluster B. So even though the issue is most likely related to the script, it does not cause it every time.

Best regards,

Nick Renders
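
P.S. For anyone who wants to replay the restore steps above one at a time, here is a minimal sketch that collects them in a single function with a dry-run switch. The names run, restore_cluster_a and DRY_RUN are assumptions made for illustration and are not part of our actual script; the commands themselves are the ones listed above. It is not meant as a fix for the "operation not permitted" error, only as a way to see exactly which commands are issued and in what order.

```shell
#!/bin/sh
# Sketch only: the restore steps from the script above, collected in one function.
# run, restore_cluster_a and DRY_RUN are hypothetical names used for illustration.
set -eu

REMOTE='postgres@10.0.0.1'
PGCTL='/Library/PostgreSQL/16/bin/pg_ctl'
DATA='/Volumes/Postgres_Data/PostgreSQL/16/data'
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command, 0 = print and execute it

# Print the command (arguments joined by spaces, original quoting not shown),
# then execute it only when DRY_RUN=0.
run() {
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Usage: restore_cluster_a <barman backup id>
restore_cluster_a() {
  backup_id="$1"

  ## stop cluster A
  run ssh "$REMOTE" "$PGCTL -D $DATA stop"

  ## save config files
  run ssh "$REMOTE" "cd $DATA && cp ARC_postgresql_16.conf ../ARC_postgresql_16.conf"
  run ssh "$REMOTE" "cd $DATA && cp pg_hba.conf ../pg_hba.conf"

  ## clear data directory (the glob is expanded by the remote shell)
  run ssh "$REMOTE" "rm -r $DATA/*"

  ## transfer recovery
  run barman recover --remote-ssh-command "ssh $REMOTE" pg "$backup_id" "$DATA"

  ## restore config files
  run ssh "$REMOTE" "cd $DATA && cd .. && mv ARC_postgresql_16.conf $DATA/ARC_postgresql_16.conf"
  run ssh "$REMOTE" "cd $DATA && cd .. && mv pg_hba.conf $DATA/pg_hba.conf"

  ## start cluster A
  run ssh "$REMOTE" "$PGCTL -D $DATA start > /dev/null"
}
```

With DRY_RUN left at its default of 1, calling restore_cluster_a 20240312T040106 only prints the eight commands; setting DRY_RUN=0 executes them, and set -e stops the run at the first step that fails.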