From: Peter Geoghegan
Date: Wed, 29 Apr 2026 12:08:12 -0400
Subject: Re: Fix size estimation for parallel B-Tree scans with skip arrays
To: Tomas Vondra
Cc: Siddharth Kothari, pgsql-bugs@lists.postgresql.org, Vaibhav Jain, Madhukar, Xun Cheng

On Wed, Apr 29, 2026 at 11:42 AM Tomas Vondra wrote:
> Thanks for the report. I'm able to reproduce the crash using your
> reproducer script. At first I've been confused why you need a BRIN index
> when this report is about btree, but I suppose that's just to force a
> parallel index scan. There are easier ways to do that, though, e.g. by
> increasing cpu_tuple_cost. Then it's enough to query just the one rel.

I pushed the fix a short while ago, but didn't include the tests. I
don't think that the added test cycles would have paid for themselves.

> How did you discover this issue? I don't think anyone else reported such
> crashes, so presumably it's not quite common.

There were many ways that this issue could accidentally fail to fail.
For example, if even one of the skip arrays happened to be on a text
column, there'd almost certainly have been no crash. In general we're
very conservative about the space we request. We have to be, because
the request is made only once, long before we really know what nbtree
preprocessing will do/how many arrays it'll output.

> It does fix it for me, but I don't know enough about the skip scan
> internals to say if the fix is right.

_bt_parallel_serialize_arrays assumes that btscan->btps_arrElems[] has
so->numArrayKeys-many elements -- with and without the fix. The easiest
way to see that the fix is correct is by noticing that
_bt_parallel_serialize_arrays expects a certain layout in shared
memory that btestimateparallelscan wasn't fully handling.

When btestimateparallelscan estimated the amount of shared memory that
the scan will require, it previously neglected to account for how skip
arrays could contribute to the size of so->arrayKeys[]. With
Siddharth's fix in place, we conservatively assume that preprocessing
will add the maximum possible number of skip arrays/use the largest
possible so->arrayKeys[]/so->numArrayKeys when we determine
btscan->btps_arrElems[] space overhead. That makes
_bt_parallel_serialize_arrays agree with btestimateparallelscan.

> Is there something we could do to deal with this class of bugs (buffer
> overflow in shared memory)? For buffers in private memory we have tools
> like valgrind and sentinels to make these issues more obvious, but for
> shared memory that's not the case ... :-(

I'm not sure that Valgrind-style instrumentation would have actually
caught this issue. As I said, our conservative approach could mask the
issue in many ways. Plus the test case involved an index with the
maximum 32 index columns, and an input scan key on the very last index
column, which is obviously very atypical.

-- 
Peter Geoghegan
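
The estimate/serialize contract described in the message above can be
sketched in a few lines of C. This is a toy model and not the nbtree
code: ToyParallelScan, the toy_* functions, and the "one int slot per
array" layout are all hypothetical names and simplifications chosen
for illustration. The only figures taken from the message are the
32-column limit and the single input scan key. The idea that
preprocessing then outputs skip arrays for the 31 preceding columns is
an assumption made for the example.

```c
#include <stddef.h>

#define MAX_INDEX_COLUMNS 32	/* the "maximum 32 index columns" case */

/* Hypothetical stand-in for a shared-memory parallel scan descriptor */
typedef struct ToyParallelScan
{
	int		header;			/* placeholder for the fixed-size fields */
	int		arr_elems[];	/* one slot per array output by preprocessing */
} ToyParallelScan;

/*
 * Estimate that only reserves a slot per input scan key.  It ignores any
 * arrays that preprocessing might add later on.
 */
size_t
toy_estimate_input_keys_only(int n_input_keys)
{
	return offsetof(ToyParallelScan, arr_elems) +
		sizeof(int) * (size_t) n_input_keys;
}

/*
 * Conservative estimate: additionally assume (for this sketch) that
 * preprocessing could add one skip array per index column.
 */
size_t
toy_estimate_conservative(int n_input_keys, int n_index_columns)
{
	return offsetof(ToyParallelScan, arr_elems) +
		sizeof(int) * ((size_t) n_input_keys + (size_t) n_index_columns);
}

/* Bytes that serialization touches when the scan ends up with n_arrays */
size_t
toy_bytes_written(int n_arrays)
{
	return offsetof(ToyParallelScan, arr_elems) +
		sizeof(int) * (size_t) n_arrays;
}
```

With one input key and 31 arrays written, toy_bytes_written(31) exceeds
toy_estimate_input_keys_only(1), which is the overflow in miniature,
while toy_estimate_conservative(1, MAX_INDEX_COLUMNS) covers it. The
conservative version over-reserves on purpose, because the estimate is
made once, before the real array count is known.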