Date: Sat, 5 Oct 2024 10:08:00 -0700
From: Noah Misch
To: Alexander Lakhin
Cc: buildfarm-members@postgresql.org
Subject: Re: shmat() fails with Not enough space on wrasse
Message-ID: <20241005170800.29@rfd.leadboat.com>

On Thu, Oct 03, 2024 at 07:00:00AM +0300, Alexander Lakhin wrote:
> Could you please look at sporadic test failures occurring on wrasse?
>
> Yesterday wrasse failed pg_rewindCheck ([3]) and it's the third failure of
> the same kind since July.
> The failure log regress_log_001_basic contains:
> # Running: pg_rewind --debug --source-server port=17304
> host=/home/nm/farm/tmp/sbGTlHPVwD dbname='postgres' user=rewind_user --target-pgdata=/export/home/nm/farm/studio64v12_6/REL_14_STABLE/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_primary_remote_data/pgdata
> --no-sync --write-recovery-conf
> pg_rewind: executing ".../inst/bin/postgres" for target server to complete crash recovery
> 2024-10-03 01:32:24.149 CEST [29411:1] FATAL:  shmat(id=1392509011, addr=0, flags=0x4000) failed: Not enough space
> pg_rewind: error: postgres single-user mode in target cluster failed
> pg_rewind: fatal: Command was: ".../inst/bin/postgres" --single -F -D
> "...src/bin/pg_rewind/tmp_check/t_001_basic_primary_remote_data/pgdata"
> template1 < "/dev/null"
> not ok 11 - pg_rewind remote
>
> I don't know SPARC Solaris, but I've found a recommendation to increase
> zone.max-locked-memory [4] in a similar situation.

I'm seeing no limit on that one and generous limits on other memory-related
parameters.  (I checked it from inside a cron job, just in case that
environment had been different.)

prctl $$ | ggrep -A4 -E '(memory|shm|^process:)'
process: 21611: sh -c { date; prctl $$ | ggrep -A4 -E '(memory|shm|^process:)'; } >/ex
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
process.max-deferred-posts
        basic              32       -   deny                             21611
        privileged        100       -   deny                                 -
--
project.max-adi-metadata-memory
        usage              0B
        system         16.0EB     max   deny                                 -
project.max-locked-memory
        usage          1.02MB
        system         16.0EB     max   deny                                 -
project.max-port-ids
        privileged      8.19K       -   deny                                 -
--
project.max-shm-memory
        usage          1.02MB
        privileged     7.85GB       -   deny                                 -
        system         16.0EB     max   deny                                 -
project.max-shm-ids
        privileged        128       -   deny                                 -
        system          16.8M     max   deny                                 -
project.max-msg-ids
        privileged        128       -   deny                                 -
--
project.max-crypto-memory
        usage              0B
        privileged     7.85GB       -   deny                                 -
        system         16.0EB     max   deny                                 -
project.max-tasks
--
zone.max-adi-metadata-memory
        usage              0B
        system         16.0EB     max   deny                                 -
zone.max-locked-memory
        usage          1.02MB
        system         16.0EB     max   deny                                 -
zone.max-mrp-ids
        usage               0
--
zone.max-shm-memory
        usage          1.02MB
        system         16.0EB     max   deny                                 -
zone.max-shm-ids
        usage               3
        system          16.8M     max   deny                                 -
zone.max-sem-ids
        usage               8
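
For comparing limits like these between environments (say, an interactive
shell versus the cron job), the output can be reduced to a small mapping.
What follows is only a rough sketch, not something that was run on wrasse.
It assumes each resource control name starts in column 0 with its entries
indented beneath it; parse_prctl and only_system_capped are hypothetical
helper names, and the sample is abbreviated from the output above.

```python
# Sample abbreviated from the prctl output above; layout is an assumption.
SAMPLE = """\
NAME    PRIVILEGE       VALUE    FLAG   ACTION       RECIPIENT
project.max-shm-memory
        usage          1.02MB
        privileged     7.85GB       -   deny                 -
        system         16.0EB     max   deny                 -
zone.max-locked-memory
        usage          1.02MB
        system         16.0EB     max   deny                 -
--
zone.max-shm-ids
        usage               3
        system          16.8M     max   deny                 -
"""


def parse_prctl(text):
    """Map each resource control name to {privilege: value}.

    Values stay as the strings prctl printed (e.g. "7.85GB"); no unit
    conversion is attempted. If a control lists the same privilege level
    more than once, the last entry wins.
    """
    controls = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blanks, the column header, grep's "--" separators and the
        # leading "process: PID: command" line.
        if not fields or fields[0] in ("NAME", "--", "process:"):
            continue
        if not line[0].isspace():
            # Unindented line: a new resource control name.
            current = controls.setdefault(fields[0], {})
        elif current is not None and len(fields) >= 2:
            # Indented line: privilege level followed by its value.
            current[fields[0]] = fields[1]
    return controls


def only_system_capped(controls):
    """Names of controls with no basic or privileged entry in the listing."""
    return sorted(
        name
        for name, entries in controls.items()
        if "basic" not in entries and "privileged" not in entries
    )


if __name__ == "__main__":
    for name in only_system_capped(parse_prctl(SAMPLE)):
        print(name)
```

With the sample above, the controls reported are the ones that show only a
"system" entry, which is what "no limit" refers to for
zone.max-locked-memory.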