From: Andres Freund <[email protected]>
To: Tom Lane <[email protected]>
Cc: Peter 'PMc' Much <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: [email protected]
Subject: Re: Need help debugging SIGBUS crashes
Date: Tue, 17 Mar 2026 17:19:49 -0400
Message-ID: <klcmv23wupsyquzm5soktrndehvrydh4yqg2osopgmtcf44mmv@akc53a66xndz>
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
Hi,
On 2026-03-17 16:56:48 -0400, Tom Lane wrote:
> "Peter 'PMc' Much" <[email protected]> writes:
> > On Tue, Mar 17, 2026 at 10:12:07AM -0400, Tom Lane wrote:
> > ! Why it was okay in older FreeBSD and not so much in v14, who knows?
>
> > Maybe it wasn't. Here it appeared out of thin air in February, while
> > the system was upgraded from 13.5 to 14.3 in July'25, and did run
> > without problems for these eight months.
> > So this is not directly or solely related to FBSD R.14, and while it
> > happens more likely during massive memory use, but this also is not
> > stingent. Neither did I find any other solid determining condition.
>
> Yeah, it seems likely that there is some additional triggering
> condition that we don't understand; otherwise there would be more
> people complaining than just you.
One issue we've seen in the past (on some other BSD, I think NetBSD?) is that
a signal handler used a C function from a shared library, the function had
never been called before the signal handler ran, and the dynamic symbol
resolution allocated memory. That then contributed to deadlocks and/or
corruption of allocator metadata.
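You could check whether that's a factor by exporting LD_BIND_NOW (set to any
non-empty value) in the postmaster's environment, so that the dynamic linker
resolves all function symbols at startup instead of lazily on first call.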
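
For cases where the service manager makes it awkward to export a variable, a
small wrapper can set it before exec'ing the real binary. What follows is only
a sketch: enable_bind_now() and exec_with_bind_now() are hypothetical helper
names, not anything from PostgreSQL, and the value "1" is arbitrary, since the
dynamic linker only checks that the variable is non-empty.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: set LD_BIND_NOW in this process's environment.
 * Anything exec'd or forked afterwards inherits it.
 * Returns 0 on success, -1 on failure (as setenv() does).
 */
int enable_bind_now(void)
{
    return setenv("LD_BIND_NOW", "1", 1);
}

/*
 * Hypothetical helper: replace the current process with the command in
 * argv (argv[0] is the program, the array is NULL-terminated), with
 * eager symbol binding requested. Only returns if something failed.
 */
int exec_with_bind_now(char *const argv[])
{
    if (enable_bind_now() != 0) {
        perror("setenv");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp");   /* reached only when execvp() fails */
    return -1;
}
```

Calling exec_with_bind_now(argv + 1) from a main() would turn this into a
wrapper command that takes the real program and its arguments on its own
command line.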
The way signal handling worked before PostgreSQL 16 should not really lead to
corrupt allocator data structures, as the signal handler is only allowed to
run in a period in which normal execution is suspended (or in which only
async-signal-safe code is called, e.g. after waking up, until reaching the
sigmask calls that block the signal again). ISTM there must either have been
another signal handler that allocated memory and was interrupted by SIGUSR1,
or the postmaster allocated memory while the signal was unmasked. The dynamic
linker doing function resolution could be an explanation.
Greetings,
Andres Freund