From: Tomas Vondra <[email protected]>
To: Siddharth Kothari <[email protected]>
To: [email protected]
Cc: Vaibhav Jain <[email protected]>
Cc: Madhukar <[email protected]>
Cc: Xun Cheng <[email protected]>
Cc: [email protected]
Subject: Re: Fix size estimation for parallel B-Tree scans with skip arrays
Date: Wed, 29 Apr 2026 17:42:55 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAGCUe0Lwk3C0qdkBa+OLpYc7yXwW=pbaz8Sju4xMXEQAmyp+5g@mail.gmail.com>
References: <CAGCUe0Lwk3C0qdkBa+OLpYc7yXwW=pbaz8Sju4xMXEQAmyp+5g@mail.gmail.com>
On 4/29/26 08:54, Siddharth Kothari wrote:
> Hi folks.
>
> This commit <https://github.com/postgres/postgres/commit/92fe23d93aa3bbbc40fca669cabc4a4d7975e327#diff-db0039b5ba12b5915e91ed6eedd78744e3cf7a77082af072d9626a5ae306c579>
> introduced parallel scan skip support, however it underestimates the
> required memory, causing it to write past the allocated shared memory
> boundary. This can corrupt any entity using the adjacent shared memory
> segment, leading to unpredictable behavior.
>
> I reproduced the issue manually on stock postgres and raised a patch
> that fixes it along with regress tests. In my repro, the issue
> manifested as postgres server crashing unexpectedly.
>
Thanks for the report. I'm able to reproduce the crash using your
reproducer script. At first I was confused about why you need a BRIN index
when this report is about btree, but I suppose that's just to force a
parallel index scan. There are easier ways to do that, though, e.g. by
increasing cpu_tuple_cost. Then it's enough to query just the one rel.

How did you discover this issue? I don't think anyone else has reported
such crashes, so presumably it's not very common.
> Root cause:
>
> In src/backend/access/nbtree/nbtree.c, the loop
> in btestimateparallelscan assumes that every index column might require
> a skip array and adds sizeof(int) to the estimated size:
>
> However, every skip array actually needs space for its slot in
> the btps_arrElems array AND space to store its scan key's sk_flags.
> Therefore, it requires sizeof(int) * 2.
>
>
> The attached patch fixes this by allocating sizeof(int) * 2 per
> attribute in btestimateparallelscan.
>
It does fix it for me, but I don't know enough about the skip scan
internals to say if the fix is right.
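
To make the arithmetic in the quoted root cause concrete, here is a
minimal sketch of the shortfall. It models only the per-attribute int
term described in the report. The function names and the fixed header
size are made up for illustration; this is not the actual nbtree code.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the fixed-size part of the parallel scan
 * descriptor. The real struct and its size are not shown in this thread.
 */
#define SKETCH_FIXED_HEADER ((size_t) 64)

/* Estimate as described in the report: one int per index attribute. */
static size_t
sketch_estimate_before(int nkeyatts)
{
    size_t      size = SKETCH_FIXED_HEADER;

    for (int attnum = 1; attnum <= nkeyatts; attnum++)
        size += sizeof(int);        /* btps_arrElems slot only */

    return size;
}

/* Estimate with the proposed fix: two ints per index attribute. */
static size_t
sketch_estimate_after(int nkeyatts)
{
    size_t      size = SKETCH_FIXED_HEADER;

    for (int attnum = 1; attnum <= nkeyatts; attnum++)
        size += sizeof(int) * 2;    /* btps_arrElems slot + sk_flags */

    return size;
}

/* Bytes by which the old estimate falls short for nkeyatts attributes. */
static size_t
sketch_shortfall(int nkeyatts)
{
    return sketch_estimate_after(nkeyatts) - sketch_estimate_before(nkeyatts);
}
```

Under these assumptions the old estimate is short by sizeof(int) for
every attribute that ends up with a skip array, so the overrun grows
with the number of index columns.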
Is there something we could do to deal with this class of bugs (buffer
overflow in shared memory)? For buffers in private memory we have tools
like valgrind and sentinels to make these issues more obvious, but for
shared memory that's not the case ... :-(
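
To illustrate the kind of sentinel check I have in mind, here is a
minimal sketch that places a guard word right after a chunk carved out
of a larger region and verifies it later. The names and the guard value
are hypothetical, and this is not an existing PostgreSQL facility. It
only operates on a raw byte region, so nothing in it is specific to
shared memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel value written after each chunk's payload. */
#define GUARD_PATTERN UINT32_C(0xDEADBEEF)

/* Bytes to reserve for a chunk of 'payload' bytes plus its trailing guard. */
static size_t
guarded_size(size_t payload)
{
    return payload + sizeof(uint32_t);
}

/* Write the guard word immediately after the payload area. */
static void
guard_init(unsigned char *chunk, size_t payload)
{
    uint32_t    guard = GUARD_PATTERN;

    /* memcpy avoids alignment assumptions about chunk + payload */
    memcpy(chunk + payload, &guard, sizeof(guard));
}

/* Return true if the guard word is intact, false if it was overwritten. */
static bool
guard_intact(const unsigned char *chunk, size_t payload)
{
    uint32_t    guard;

    memcpy(&guard, chunk + payload, sizeof(guard));
    return guard == GUARD_PATTERN;
}
```

A writer that stays within the payload leaves the guard untouched,
while one that runs even a single int past the end clobbers it, so a
check at the end of the scan (or at detach time) would flag the
overflow instead of letting it silently corrupt a neighbor.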
regards
--
Tomas Vondra