public inbox for [email protected]  
help / color / mirror / Atom feed
From: Matheus Alcantara <[email protected]>
To: Alexander Korotkov <[email protected]>
Cc: Alexander Pyhalov <[email protected]>
Cc: Pgsql Hackers <[email protected]>
Subject: Re: Asynchronous MergeAppend
Date: Mon, 06 Apr 2026 20:48:05 -0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAPpHfdsO8zYpDW==D6T5N0cJ+AzK7a_OyXJoYU1kFi=xZFTLuQ@mail.gmail.com>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAFY6G8d3Yvxa_kRQA24BsJhwqfmSCv1ujiv_7b6g5isf-ZTs_Q@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAPpHfdu+3Eud0CBpdFT+osWiT=e=zOQUtBsx8Z5okKrqhgVAJg@mail.gmail.com>
	<[email protected]>
	<CAPpHfdsO8zYpDW==D6T5N0cJ+AzK7a_OyXJoYU1kFi=xZFTLuQ@mail.gmail.com>

On Sat Apr 4, 2026 at 11:24 PM -03, Alexander Korotkov wrote:
> I made more work on the patchset.
>
> Patch #1 now considers IncrementalSort as exclusion alongside with
> Sort.  Exclusion check is now on the top of the switch().
> Patch #2 is split into 3 patches: common structures, common sync
> append logic, and common async append logic.
> New structs are now named AppendBase/AppendBaseState, corresponding
> fields are "ab" and "as".
>
> Most importantly I noted that this patchset actually only makes
> initial heap filling asynchronous.  The steady work after that is
> still syncnronous.  Even that it used async infrastructure, it fetched
> tuples from children subplans one-by-one: effectively synchronous but
> paying for asynchronous infrastructure.  I think even with this
> limitation, this patchset is valuable: the startup cost for children
> foreignscans can be high.  But this understanding allowed me to
> significantly simplify the main patch including:
> 1) After initial heap filling, use ExecProcNode() to fetch from children plans.
> 2) Remove ms_has_asyncresults entirely. Async responses store directly
> into ms_slots[] (the existing heap slot array), which serves as both
> the merge state and the "result arrived" indicator via TupIsNull().
> 3) Removed needrequest usage from MergeAppend. Since MergeAppend only
> fires initial requests (via ExecAppendBaseAsyncBegin()) and never
> sends follow-up requests, needrequest tracking is unnecessary.
> ExecMergeAppendAsyncRequest() was eliminated entirely.
> 4)  ExecMergeAppendAsyncGetNext() reduced to a simple wait loop:
> 5)  asyncresults allocation reduced back to nasyncplans.  MergeAppend
> doesn't use it (stores in ms_slots), and Append only needs nasyncplans
> entries for its stack.
>
> Additionally, I made the following changes.
> 1) WAIT_EVENT_MERGE_APPEND_READY wait event instead of extending
> WAIT_EVENT_APPEND_READY.  That should be less confusing for monitoring
> purposes.
> 2) More tests: error handling with broken partition, plan-time
> partition pruning, and run-time partition pruning tests for async
> MergeAppend.
>

Thanks for v17, the split of 0002 into multiple patches seems much
better. Overall I agree with the changes on v17 compared with v16, the
removal of ms_has_asyncresults makes the code better to read and follow.
The separate WAIT_EVENT_MERGE_APPEND_READY for better monitoring is also
good, I've tested some long running queries and I've find the event on
pg_stat_activity. The steady work changes also looks good.

One minor issue on 0002:

+	bool		valid_subplans_identified;	/* is valid_asyncplans valid? */
+	Bitmapset  *valid_subplans;
+	Bitmapset  *valid_asyncplans;	/* valid asynchronous plans indexes */

Should be /* is valid_subplans valid? */

-----

Minor comment on 0005:

+static void
+ExecMergeAppendAsyncGetNext(MergeAppendState *node, int mplan)
+{
+	/*
+	 * All initial async requests were fired by ExecAppendBaseAsyncBegin.

Wondering if we should reference ExecMergeAppendAsyncBegin() instead of
ExecAppendBaseAsyncBegin() since this is on nodeMergeAppend, what do you
think?

--
Matheus Alcantara
EDB: https://www.enterprisedb.com





view thread (33+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Asynchronous MergeAppend
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox