public inbox for [email protected]
help / color / mirror / Atom feedFrom: Junwang Zhao <[email protected]>
To: cca5507 <[email protected]>
Cc: Amit Langote <[email protected]>
Cc: Daniil Davydov <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Cc: Tomas Vondra <[email protected]>
Subject: Re: Batching in executor
Date: Tue, 3 Feb 2026 23:54:48 +0800
Message-ID: <CAEG8a3K22s9LmDZa486suFu3AV=yRuTf09j+Ww+djG63y=R8hg@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CA+HiwqFfAY_ZFqN8wcAEMw71T9hM_kA8UtyHaZZEZtuT3UyogA@mail.gmail.com>
<[email protected]>
<CA+HiwqGqeS_94wxiYa8VpymiR_OtFPKDSpX+Me=MYWO45f5yig@mail.gmail.com>
<[email protected]>
<CA+HiwqGM0ZTeVHicSkGnCp-2U-jvU-KBQCkPJ0N7nAj_c2LjZg@mail.gmail.com>
<CA+HiwqHyE7-oOvtZ+OC-4N7DvKSr8Jbu75erMLQ7O4d6gfkBhg@mail.gmail.com>
<CA+HiwqEPMwhg6pUE4XML2rG4fRqKMpeGTtMwRPGes90f9iOqtg@mail.gmail.com>
<CA+HiwqEZja5rJ78p3FBDZNvynWsHwanxyt6h0YaK_r84NemXng@mail.gmail.com>
<[email protected]>
<CAJDiXggP41+-HrRzT+BmtgmS8wkoZM7b4skkQA5NAe+NFEMPSQ@mail.gmail.com>
<CA+HiwqGufj46_4kT6z5nQrCOxp3rcMDR8fn_t33n2pDKokNRQg@mail.gmail.com>
<CA+HiwqH-2GmTKLW9kHdnqV4KdFiPfuAdVK2TgqOM2JaaeUYXnw@mail.gmail.com>
<CAEG8a3JKNRGwBb-C_YV6Px4aFHwd_G=AsB5AbfyU4-8YZT54AQ@mail.gmail.com>
<[email protected]>
On Tue, Feb 3, 2026 at 9:30 PM cca5507 <[email protected]> wrote:
>
> Hi,
>
> Some comments for v5:
>
> 0001
> ====
>
> 1) heap_begin_batch()
>
> ```
> /* Single allocation for HeapBatch header + tupdata array */
> alloc_size = sizeof(HeapBatch) + sizeof(HeapTupleData) * maxitems;
> hb = palloc(alloc_size);
> hb->tupdata = (HeapTupleData *) ((char *) hb + sizeof(HeapBatch));
> ```
>
> Do we need a MAXALIGN() here to avoid unaligned access? Something like this:
TBH I don't think this single allocation helps too much, it's not on
the hot path,
but makes the code harder to read ;(
>
> ```
> /* Single allocation for HeapBatch header + tupdata array */
> alloc_size = MAXALIGN(sizeof(HeapBatch)) + sizeof(HeapTupleData) * maxitems;
> hb = palloc(alloc_size);
> hb->tupdata = (HeapTupleData *) ((char *) hb + MAXALIGN(sizeof(HeapBatch)));
> ```
>
> Or how about just using zero-length array:
>
> ```
> typedef struct HeapBatch
> {
> Buffer buf;
> int maxitems;
> int nitems;
> HeapTupleData tupdata[FLEXIBLE_ARRAY_MEMBER];
> } HeapBatch;
>
> // and
> hb = palloc(offsetof(HeapBatch, tupdata) + sizeof(HeapTupleData) * maxitems);
> ```
>
> 2) pgstat_count_heap_getnext_batch()
>
> ```
> #define pgstat_count_heap_getnext_batch(rel, n) \
> do { \
> if (pgstat_should_count_relation(rel)) \
> (rel)->pgstat_info->counts.tuples_returned += n; \
> } while (0)
> ```
>
> "+= n" -> "+= (n)", just like pgstat_count_index_tuples().
>
> 0002
> ====
>
> 1) TupleBatchCreate()
>
> ```
> /* Single allocation for TupleBatch + inslots + outslots arrays */
> alloc_size = sizeof(TupleBatch) + 2 * sizeof(TupleTableSlot *) * capacity;
> b = palloc(alloc_size);
> inslots = (TupleTableSlot **) ((char *) b + sizeof(TupleBatch));
> outslots = (TupleTableSlot **) ((char *) b + sizeof(TupleBatch) +
> sizeof(TupleTableSlot *) * capacity);
> ```
>
> Do we need a MAXALIGN() here to avoid unaligned access?
>
> 2) TupleBatchReset()
>
> ```
> for (int i = 0; i < b->maxslots; i++)
> {
> ExecClearTuple(b->inslots[i]);
> if (drop_slots)
> ExecDropSingleTupleTableSlot(b->inslots[i]);
> }
> ```
>
> ExecDropSingleTupleTableSlot() will call ExecClearTuple(), so ExecClearTuple() will be
> called twice if drop_slots is true, I think we can avoid this.
>
> 3) ScanCanUseBatching()
>
> In heap_beginscan(), we may disable page-at-a-time mode:
>
> ```
> /*
> * Disable page-at-a-time mode if it's not a MVCC-safe snapshot.
> */
> if (!(snapshot && IsMVCCSnapshot(snapshot)))
> scan->rs_base.rs_flags &= ~SO_ALLOW_PAGEMODE;
> ```
>
> It seems that ScanCanUseBatching() didn't consider this.
>
> 4) struct TupleBatch
>
> ```
> struct TupleTableSlot **inslots; /* slots for tuples read "into" batch */
> struct TupleTableSlot **outslots; /* slots for tuples going "out of"
> * batch */
> struct TupleTableSlot **activeslots;
> ```
>
> I think we can remove the word "struct".
>
> 5) ExecScanExtendedBatchSlot()
>
> ```
> /* Get next input slot from current batch, or refill */
> if (!TupleBatchHasMore(b))
> {
> if (!accessBatchMtd(node))
> return NULL;
> }
> ```
>
> I think we cannot just return NULL here, see comments in ExecScanExtended():
>
> ```
> /*
> * if the slot returned by the accessMtd contains NULL, then it means
> * there is nothing more to scan so we just return an empty slot,
> * being careful to use the projection result slot so it has correct
> * tupleDesc.
> */
> if (TupIsNull(slot))
> {
> if (projInfo)
> return ExecClearTuple(projInfo->pi_state.resultslot);
> else
> return slot;
> }
> ```
>
> And why not just write this function like ExecScanExtended() and ExecScanFetch()?
>
> --
> Regards,
> ChangAo Chen
--
Regards
Junwang Zhao
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Batching in executor
In-Reply-To: <CAEG8a3K22s9LmDZa486suFu3AV=yRuTf09j+Ww+djG63y=R8hg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox