Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v3uti-005Zij-2h for pgsql-hackers@arkaria.postgresql.org; Wed, 01 Oct 2025 11:21:26 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v3utf-0012VL-It for pgsql-hackers@arkaria.postgresql.org; Wed, 01 Oct 2025 11:21:24 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v3utf-0012V1-7o for pgsql-hackers@lists.postgresql.org; Wed, 01 Oct 2025 11:21:23 +0000 Received: from mail-lj1-x22a.google.com ([2a00:1450:4864:20::22a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v3utZ-000rQp-1m for pgsql-hackers@lists.postgresql.org; Wed, 01 Oct 2025 11:21:22 +0000 Received: by mail-lj1-x22a.google.com with SMTP id 38308e7fff4ca-36a6a3974fdso43284191fa.0 for ; Wed, 01 Oct 2025 04:21:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=enterprisedb.com; s=google; t=1759317675; x=1759922475; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=F9c91nEHc0wJzYG2OhsFUlpvMAFOunHy9k6Q2Hvj/QE=; b=lg5loEf3sXClUprr1a/60KjUy77y2kRy9k+NzbSIp8Y2czImmFx/qck8yXSFQkMjcx xxGHZUgfggvhxABQaBZtSsR96eZOWeJIUkYUQYnb5XR/vxgLCj3STJ72ejvgf94zyyIv 7jWLtS1aTLv0fuRrL6tw1VW/cbgEkgdNBUoHtV1YjhuEe/p9OiyfMuaIcQhOvtX3UNG3 Sd5gaHpdev8Cd0iB3siAHqbEZK3Z+LdKlqfmEkoFircyQfozuqQ99u83XCOgsuBPdrm0 b3EUhyxn4uEVHMLkShqhIDSHXe9uqGPXVp7xPFavQILt9//UkLUeYO+AUX+ZnMoe/BSV 00/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759317675; x=1759922475; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=F9c91nEHc0wJzYG2OhsFUlpvMAFOunHy9k6Q2Hvj/QE=; b=i5ksrFpF0O5/qA76BSSAW4TCmEO6vQgR/2xaMP+u2FQwnHvJb2FLIvxYYWGmK7l8CC kfWUGtgDm5wla0qOjamFBgRLrGBLHtHf4pe2UEIdQKC0vnjxJaNvsq94xQyVK4/T1MfY jyVyit/WWJH61iVn6H1eAjy6ZMxC04MEEXkaQpYAHSw0zP809bs4IIWduGPp5/Uontdw oHNR8OTi9wQ48Lr8eYU44gJ4h1NN5rfHqTSIkUkj/bSq2qDlI0M7+DGu/xNlan/55Gox QFDG7Fw/d2fjqnE2I/qm4V9eyt2p4qkoMma5dhAwBcfa1kh14/ZerFoU5a3q3lpppmcJ ofFg== X-Forwarded-Encrypted: i=1; AJvYcCU8kYnD6o/M5atn/Reqb8PPDugIpJrkZhv8H38/Wtt3juhmpb7XXiuK0K+iTJ8uBeEJmUbJec2IivuLSlMo@lists.postgresql.org X-Gm-Message-State: AOJu0Yw9bcfvIt9zgXs3Nocarz5MrWf91xXMIIB7d5ZQb3Sz+nJNTDIq lVc+ymDuQIxf5DsFsJQLG6S1mNledCXRiPOxlEkX3lMwcdM+IjAKRa8Cao0/0AGXEI9rQFyCuyQ TcvCyqK27+tL2xtWJSx/uhFAqzPSmi8IYa5W9WS5x X-Gm-Gg: ASbGnct9L+VdhuFWcVtx34bdLDyqqQCd0xggMLl/4Uj4FxvWdVxBmf09OeViYwnWCWW D0HMdjiZawhDHSMEZEsvLT/IO6nsOKR1G7NHmdLpPKM0ErqqOsNWqsDV2tshcz84P7Snq+KfR27 GzixpdTVSWF8K4wMmBZI0oBOfLjzInxirDQLQJrQABdDYfmBpCaK34JiDqoNGRa7XJKJ+37bG5A 8GowAyRbKKUP05eWpHgxA9ZLOvCAZPj X-Google-Smtp-Source: AGHT+IFcXJ1b93ryyXYYVeUgLMWvLF9f/Gg30w56H8OH6IszUhaQOni7kUDzEjZ8tSfFK0HVLxS/Wk387KHs0og9qfM= X-Received: by 2002:a2e:a5c1:0:b0:372:89e0:7ef4 with SMTP id 38308e7fff4ca-373a754adefmr8111181fa.41.1759317675217; Wed, 01 Oct 2025 04:21:15 -0700 (PDT) MIME-Version: 1.0 References: <013D63E2-5D75-492E-85FF-1D5CC0148C82@gmail.com> <499686.1759250489@sss.pgh.pa.us> <68dcd1f2.df0a0220.3300c0.f7af@mx.google.com> In-Reply-To: <68dcd1f2.df0a0220.3300c0.f7af@mx.google.com> From: Jakub Wartak Date: Wed, 1 Oct 2025 13:20:48 +0200 X-Gm-Features: AS18NWAfZhb5AE8ESgS42zlhkp8_rG8Zg35ennVfzlau7oJZx7KpCrZhLpUWHV4 Message-ID: Subject: Re: The ability of postgres to determine loss of files of the main fork To: Michael Banck Cc: Tom Lane , Aleksander Alekseev , Frits Hoogland , pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Oct 1, 2025 at 9:02=E2=80=AFAM Michael Banck wrote= : > > Hi, > > wow, this is one of the most terrifying threads I've ever seen... Same. > On Tue, Sep 30, 2025 at 12:41:29PM -0400, Tom Lane wrote: > > Aleksander Alekseev writes: > > >> Therefore, I would like to request an enhancement: add an option to > > >> verify_heapam() that causes the primary key index to be scanned and = makes > > >> sure that all line pointers in the index point to existing tuples. > > > > > ... IMO there is little value in adding a check for the existence of > > > the segments for a single table. And the *real* check will not differ > > > much from something like SELECT * FROM my_table, or from making a > > > complete backup of the database. > > > > As Frits mentioned, neither of those actions will really notice if a > > table has been truncated via loss of a segment. > > Is there a valid case for a missing segment? If not, couldn't this be > caught somewhere in the storage manager? > I've took a look on PG17 and in _mfd_openseg() there's if fd < 0 return NULL after open(), but out of it's callers only _mdfd_getseg() seems to be alerting on that NULL. To me this seems like a bug, because i've seen way too many times people and software deleting files randomly. Even simple crashes (with e2fsck, xfs_repair) could put orphaned inodes into /lost+found. IMHO all files should be opened at least on startup to check integrity, because the non-zero return code (during such SELECT) for openat(2) seems o be coming out of RelationGetNumberOfBlocksInFork()->table_block_relation_size()->smgrnblocks= ()->mdnblocks()->_mdfd_openseg(). Now if the 1st seg file would be missing we would complain in mdopenfork(). mdnblocks() says even "all active segments of the relation are opened...", but even that apparently is not true. The bigger context seems to be be that 049469e7e7cfe0c69 (2015) could be culprit here as well, as it is stated there that mdnblocks() could earlier create zero-length files back in day and it removed that ereport(ERROR) when unable to access that file. Another idea (than this being a bug) is that Thomas had a large relation patchset back in [1], but I wouldn't be a fan of us operating on 31-32TB files ;) -J. [1] - https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BBGXwMbrvzXAjL8= VMGf25y_ga_XnO741g10y0%3Dm6dDiA%40mail.gmail.com