Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wA8CK-0025mM-2c for pgsql-bugs@arkaria.postgresql.org; Tue, 07 Apr 2026 15:18:37 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wA8CJ-0013k0-1F for pgsql-bugs@arkaria.postgresql.org; Tue, 07 Apr 2026 15:18:35 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wA2Bv-00Fyzf-2v for pgsql-bugs@lists.postgresql.org; Tue, 07 Apr 2026 08:53:48 +0000 Received: from fout-a7-smtp.messagingengine.com ([103.168.172.150]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wA2Bt-000000010D9-0VfD for pgsql-bugs@lists.postgresql.org; Tue, 07 Apr 2026 08:53:46 +0000 Received: from phl-compute-02.internal (phl-compute-02.internal [10.202.2.42]) by mailfout.phl.internal (Postfix) with ESMTP id CBD2BEC01AD; Tue, 7 Apr 2026 04:53:43 -0400 (EDT) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-02.internal (MEProxy); Tue, 07 Apr 2026 04:53:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zeha.at; h=cc:cc :content-transfer-encoding:content-type:content-type:date:date :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1775552023; x=1775638423; bh=8iUkKyNioqrBQiWyKgm1r3G2oMeKyInvUjLs62xnP6M=; b= WU0sXIO6qdqTUj60M3JjGcTBoyEGGaZLXJXfjke9m6IC+ioCra/gsx06M7m6LvE/ v+A1R94BIUz/ONEzy4O8sCOw/6LhKeU74J+h8OVzxoIxJldoxs2e2DXb5cjE/veu IGCs6/75ae7Ximrb6PxgY4pAti4UT08flG1SUvfHObNfo+z0ZmYxG1Z2H8oVRiI2 6cNhpHgn+5idHoZn75QqARaOZZX07zgcmGNcivddujqjrAUrP7QXn1UGXdSk9Mn2 B/y8PI9zJlmkPh9+NtqccaEpHRvQoxzUf3OeFYVBkjugXyMzz2g1MfWxcfnBC4r+ pqsqtdf+z8wAj4swa0PESA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1775552023; x= 1775638423; bh=8iUkKyNioqrBQiWyKgm1r3G2oMeKyInvUjLs62xnP6M=; b=J 3yRIjwqyICl5jgCZpv4XmQkxgOhmZpopqel+Wd6fcMgR+oX/Pjt+mgh8yOwjLuAV JP5Ut4o/kUjt9oEPuf4SF1lRERsVvUHB6HNzQRLbkvYodQP9iOlUNOW+bH8Rj07X nEvPGqdm06KcExRMmIUVcFuIyCCeKwjdHGApkSVB7OsVKUw+8TFZoLOmAlvyOkZW xN9x6wGlDJJMMChxy5LffSjhSuwhnuHnMU9N0uETmtkABUhS0xCVOg+z/yHXEWVz Ox22/yA8vm5TFDBuERbk27+UX0eLDQjNq7V6lZM8NWt9S2y8vxd1j0z0jlwQIchh y70Xck7IMc9XJQUcRMG2Q== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefhedrtddtgddvtddvudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpeffhffvvefukfhfgggtugfgjgesthekredttddtjeenucfhrhhomhepvehhrhhishcu jfhofhhsthgrvgguthhlvghruceotghhodhpghesiigvhhgrrdgrtheqnecuggftrfgrth htvghrnhepleeuheeihffgueeufeetffdvgeeiveekfefhffehtddugfffuddtveelgffh hfegnecuffhomhgrihhnpehpohhsthhgrhdrvghsnecuvehluhhsthgvrhfuihiivgeptd enucfrrghrrghmpehmrghilhhfrhhomheptghhodhpghesiigvhhgrrdgrthdpnhgspghr tghpthhtohepvddpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepthhhohhmrghsrd hmuhhnrhhosehgmhgrihhlrdgtohhmpdhrtghpthhtohepphhgshhqlhdqsghughhssehl ihhsthhsrdhpohhsthhgrhgvshhqlhdrohhrgh X-ME-Proxy: Feedback-ID: id0494954:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 7 Apr 2026 04:53:41 -0400 (EDT) Date: Tue, 7 Apr 2026 10:53:36 +0200 From: Chris Hofstaedtler To: Thomas Munro Cc: pgsql-bugs@lists.postgresql.org Subject: Re: BUG #19416: Backend SIGSEGV in ExecShutdownHashJoin/ExecHashTableDetach/dsa_free Message-ID: References: <19416-967b71bb54d8fe9b@postgresql.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk * Thomas Munro [260328 08:03]: >On Thu, Feb 26, 2026 at 11:28 AM PG Bug reporting form > wrote: >> #0 0x0000564830c3c9d1 in dsa_free (area=0x5648621a9280, dp=) >> at ./build/../src/backend/utils/mmgr/dsa.c:845 >> 845 ./build/../src/backend/utils/mmgr/dsa.c: No such file or directory. >> (gdb) bt full >> #0 0x0000564830c3c9d1 in dsa_free (area=0x5648621a9280, dp=) >> at ./build/../src/backend/utils/mmgr/dsa.c:845 >> segment_map = 0x5648621a92b8 >> pageno = 0 >> span_pointer = 0 >> span = >> superblock = >> object = >> size = >> size_class = >> #1 0x0000564830917ae1 in ExecHashTableDetach (hashtable=0x564862147fa0) at >> ./build/../src/backend/executor/nodeHash.c:3439 >> i = >> pstate = 0x7ef9970b5040 > >Hmm, it's not supposed to be possible for two backends to reach the >dsa_free() at nodeHash.c:3439. But it does smell a bit like that >that's what happened... hmmm. We were in _RUN phase, and then this >barrier should only let one process through to _FREE. I'd be >interested in the contents of pstate->build_barrier in frame 1 if you >get another core file with the same stack in it. Sorry for the lack of followup. Currently we are thinking our problem matches the bug described in https://postgr.es/m/044401dcabac$fe432490$fac96db0$@icloud.com ("[BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path"). >Can you share the >outline of the query plan? Does it happen to have a full/right PHJ >in it? It appears we forgot to capture more of the logs (before they expired). Here is a plan from a smaller machine carrying a similar dataset - unfortunately I cannot guarantee it's the same plan, although it seems close. Indeed it has a "Parallel Hash Right Join" node. Nested Loop (cost=13030498.63..46923224.94 rows=21144 width=36) -> Nested Loop (cost=13030498.19..46911677.41 rows=21144 width=44) -> GroupAggregate (cost=13030497.75..46808339.62 rows=54669 width=2120) Group Key: subscriber.id Filter: (last_agg(nsr.voicemail_present ORDER BY nsr.sort) = 'true'::text) -> Gather Merge (cost=13030497.75..19559670.85 rows=53993659 width=23) Workers Planned: 5 -> Sort (cost=13029497.68..13056494.51 rows=10798732 width=23) Sort Key: subscriber.id, nsr.sort -> Parallel Hash Left Join (cost=10633766.78..11767970.44 rows=10798732 width=23) Hash Cond: (subscriber.id = subscription.subscriber_id) Filter: ((nsr.postpaid IS NULL) OR (nsr.postpaid = (subscription.id IS NOT NULL))) -> Hash Left Join (cost=10585690.41..11619799.44 rows=11184278 width=24) Hash Cond: (subscriber.id = subscriber_options.subscriber_id) Filter: ((nsr.option_ids IS NULL) OR (nsr.option_ids <@ subscriber_options.option_ids)) -> Parallel Hash Right Join (cost=9142123.87..10141640.84 rows=13177920 width=49) Hash Cond: ((optionvalue_lso.subscriber_id = subscriber.id) AND (optionvalue_lso.option_id = nsr.optionvalue_option_id) AND (optionvalue_lso.value = nsr.optionvalue_value)) Filter: ((nsr.optionvalue_option_id IS NULL) OR (optionvalue_lso.id IS NOT NULL)) -> Parallel Seq Scan on lnk_subscriber_option optionvalue_lso (cost=0.00..728660.20 rows=1901347 width=21) Filter: ((activationdate <= now()) AND (deletiondate > now())) -> Parallel Hash (cost=8769950.27..8769950.27 rows=13177920 width=62) -> Parallel Hash Join (cost=2888653.49..8769950.27 rows=13177920 width=62) Hash Cond: (lnk_subscriber_simcard.subscriber_id = subscriber.id) Join Filter: (((nsr.brand_id IS NULL) OR (nsr.brand_id = subscriber.brand_id)) AND ((nsr.registered IS NULL) OR (nsr.registered = subscriber.registered)) AND ((nsr.customer_category_ids IS NULL) OR (basedata.customercategory_id = ANY (nsr.customer_category_ids))) AND ((nsr.subscriberstatus_ids IS NULL) OR (lnk_subscriber_status.subscriberstatus_id = ANY (nsr.subscriberstatus_ids)))) -> Nested Loop (cost=1148075.17..6707980.64 rows=34348223 width=126) Join Filter: ((simcard.imsi >= nsr.imsi_start) AND (simcard.imsi <= nsr.imsi_end)) -> Parallel Hash Join (cost=1148075.17..1298129.01 rows=1807801 width=12) Hash Cond: (lnk_subscriber_simcard.simcard_id = simcard.id) -> Parallel Seq Scan on lnk_subscriber_simcard (cost=0.00..145308.36 rows=1807801 width=8) Filter: ((activationdate <= now()) AND (deletiondate > now())) -> Parallel Hash (cost=1118684.52..1118684.52 rows=2351252 width=12) -> Parallel Seq Scan on simcard (cost=0.00..1118684.52 rows=2351252 width=12) -> Materialize (cost=0.00..7.56 rows=171 width=138) -> Seq Scan on network_settings_rule nsr (cost=0.00..6.71 rows=171 width=138) -> Parallel Hash (cost=1717950.74..1717950.74 rows=1810207 width=21) -> Parallel Hash Left Join (cost=778201.62..1717950.74 rows=1810207 width=21) Hash Cond: (subscriber.id = lnk_subscriber_basedata.subscriber_id) -> Parallel Hash Join (cost=342951.95..1270264.89 rows=1810207 width=17) Hash Cond: (lnk_subscriber_status.subscriber_id = subscriber.id) -> Parallel Bitmap Heap Scan on lnk_subscriber_status (cost=102927.10..1025488.25 rows=1810207 width=8) Recheck Cond: (deletiondate > now()) Filter: (activationdate <= now()) -> Bitmap Index Scan on idx_subscriber_status_deletiondate (cost=0.00..100664.35 rows=9051037 width=0) Index Cond: (deletiondate > now()) -> Parallel Hash (cost=216653.82..216653.82 rows=1869682 width=9) -> Parallel Seq Scan on subscriber (cost=0.00..216653.82 rows=1869682 width=9) -> Parallel Hash (cost=413397.43..413397.43 rows=1748179 width=12) -> Parallel Hash Left Join (cost=228746.63..413397.43 rows=1748179 width=12) Hash Cond: (lnk_subscriber_basedata.basedata_id = basedata.id) -> Parallel Seq Scan on lnk_subscriber_basedata (cost=0.00..180061.84 rows=1748179 width=12) Filter: ((now() >= activationdate) AND (now() < deletiondate)) -> Parallel Hash (cost=201487.39..201487.39 rows=2180739 width=8) -> Parallel Seq Scan on basedata (cost=0.00..201487.39 rows=2180739 width=8) -> Hash (cost=1434914.41..1434914.41 rows=692170 width=36) -> Subquery Scan on subscriber_options (cost=1345069.22..1434914.41 rows=692170 width=36) -> HashAggregate (cost=1345069.22..1427992.71 rows=692170 width=36) Group Key: lnk_subscriber_option.subscriber_id Planned Partitions: 8 -> Index Scan using idx_option_subscriber_combined on lnk_subscriber_option (cost=0.57..1033129.48 rows=9506735 width=8) Index Cond: ((activationdate <= now()) AND (deletiondate > now())) -> Parallel Hash (cost=46189.54..46189.54 rows=150946 width=8) -> Parallel Hash Join (cost=14649.64..46189.54 rows=150946 width=8) Hash Cond: (lnk_subscription_status.subscription_id = subscription.id) -> Hash Join (cost=1.27..31144.93 rows=150946 width=4) Hash Cond: (lnk_subscription_status.subscriptionstatus_id = subscriptionstatus.id) -> Parallel Seq Scan on lnk_subscription_status (cost=0.00..30421.01 rows=211324 width=8) Filter: ((activationdate <= now()) AND (deletiondate > now())) -> Hash (cost=1.14..1.14 rows=10 width=4) -> Seq Scan on subscriptionstatus (cost=0.00..1.14 rows=10 width=4) Filter: activestate -> Parallel Hash (cost=11245.39..11245.39 rows=272239 width=8) -> Parallel Seq Scan on subscription (cost=0.00..11245.39 rows=272239 width=8) -> Index Scan using idx_lnk_subscriber_msisdn_subscriber_deletiondate on lnk_subscriber_msisdn (cost=0.44..1.87 rows=1 width=8) Index Cond: ((subscriber_id = subscriber.id) AND (deletiondate > now())) Filter: (activationdate <= now()) -> Index Scan using pkey_msisdn on msisdn (cost=0.43..0.53 rows=1 width=20) Index Cond: (id = lnk_subscriber_msisdn.msisdn_id) For now our plan is to wait for the next 18.x stable release, and then reenable enable_parallel_hash. Then see if it occours again. As a side-note, my colleagues noted that the problematic query has drastic better timings with enable_parallel_hash = off. Something on our list for future exploration. Best, Chris