public inbox for [email protected]  
help / color / mirror / Atom feed
From: =?utf-8?B?Y2NhNTUwNw==?= <[email protected]>
To: =?utf-8?B?SnVud2FuZyBaaGFv?= <[email protected]>
To: =?utf-8?B?QW1pdCBMYW5nb3Rl?= <[email protected]>
Cc: =?utf-8?B?RGFuaWlsIERhdnlkb3Y=?= <[email protected]>
Cc: =?utf-8?B?UG9zdGdyZVNRTC1kZXZlbG9wbWVudA==?= <[email protected]>
Cc: =?utf-8?B?VG9tYXMgVm9uZHJh?= <[email protected]>
Subject: Re: Batching in executor
Date: Tue, 3 Feb 2026 21:30:15 +0800
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAEG8a3JKNRGwBb-C_YV6Px4aFHwd_G=AsB5AbfyU4-8YZT54AQ@mail.gmail.com>
References: <CA+HiwqFfAY_ZFqN8wcAEMw71T9hM_kA8UtyHaZZEZtuT3UyogA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqGqeS_94wxiYa8VpymiR_OtFPKDSpX+Me=MYWO45f5yig@mail.gmail.com>
	<[email protected]>
	<CA+HiwqGM0ZTeVHicSkGnCp-2U-jvU-KBQCkPJ0N7nAj_c2LjZg@mail.gmail.com>
	<CA+HiwqHyE7-oOvtZ+OC-4N7DvKSr8Jbu75erMLQ7O4d6gfkBhg@mail.gmail.com>
	<CA+HiwqEPMwhg6pUE4XML2rG4fRqKMpeGTtMwRPGes90f9iOqtg@mail.gmail.com>
	<CA+HiwqEZja5rJ78p3FBDZNvynWsHwanxyt6h0YaK_r84NemXng@mail.gmail.com>
	<[email protected]>
	<CAJDiXggP41+-HrRzT+BmtgmS8wkoZM7b4skkQA5NAe+NFEMPSQ@mail.gmail.com>
	<CA+HiwqGufj46_4kT6z5nQrCOxp3rcMDR8fn_t33n2pDKokNRQg@mail.gmail.com>
	<CA+HiwqH-2GmTKLW9kHdnqV4KdFiPfuAdVK2TgqOM2JaaeUYXnw@mail.gmail.com>
	<CAEG8a3JKNRGwBb-C_YV6Px4aFHwd_G=AsB5AbfyU4-8YZT54AQ@mail.gmail.com>

Hi,

Some comments for v5:

0001
====

1) heap_begin_batch()

```
	/* Single allocation for HeapBatch header + tupdata array */
	alloc_size = sizeof(HeapBatch) + sizeof(HeapTupleData) * maxitems;
	hb = palloc(alloc_size);
	hb->tupdata = (HeapTupleData *) ((char *) hb + sizeof(HeapBatch));
```

Do we need a MAXALIGN() here to avoid unaligned access? Something like this:

```
	/* Single allocation for HeapBatch header + tupdata array */
	alloc_size = MAXALIGN(sizeof(HeapBatch)) + sizeof(HeapTupleData) * maxitems;
	hb = palloc(alloc_size);
	hb->tupdata = (HeapTupleData *) ((char *) hb + MAXALIGN(sizeof(HeapBatch)));
```

Or how about just using zero-length array:

```
typedef struct HeapBatch
{
	Buffer			buf;
	int				maxitems;
	int				nitems;
	HeapTupleData	tupdata[FLEXIBLE_ARRAY_MEMBER];
} HeapBatch;

// and
hb = palloc(offsetof(HeapBatch, tupdata) + sizeof(HeapTupleData) * maxitems);
```

2) pgstat_count_heap_getnext_batch()

```
#define pgstat_count_heap_getnext_batch(rel, n)						\
	do {															\
		if (pgstat_should_count_relation(rel))						\
			(rel)->pgstat_info->counts.tuples_returned += n;		\
	} while (0)
```

"+= n" -> "+= (n)", just like pgstat_count_index_tuples().

0002
====

1) TupleBatchCreate()

```
	/* Single allocation for TupleBatch + inslots + outslots arrays */
	alloc_size = sizeof(TupleBatch) + 2 * sizeof(TupleTableSlot *) * capacity;
	b = palloc(alloc_size);
	inslots = (TupleTableSlot **) ((char *) b + sizeof(TupleBatch));
	outslots = (TupleTableSlot **) ((char *) b + sizeof(TupleBatch) +
		sizeof(TupleTableSlot *) * capacity);
```

Do we need a MAXALIGN() here to avoid unaligned access?

2) TupleBatchReset()

```
	for (int i = 0; i < b->maxslots; i++)
	{
		ExecClearTuple(b->inslots[i]);
		if (drop_slots)
			ExecDropSingleTupleTableSlot(b->inslots[i]);
	}
```

ExecDropSingleTupleTableSlot() will call ExecClearTuple(), so ExecClearTuple() will be
called twice if drop_slots is true, I think we can avoid this.

3) ScanCanUseBatching()

In heap_beginscan(), we may disable page-at-a-time mode:

```
	/*
	 * Disable page-at-a-time mode if it's not a MVCC-safe snapshot.
	 */
	if (!(snapshot && IsMVCCSnapshot(snapshot)))
		scan->rs_base.rs_flags &= ~SO_ALLOW_PAGEMODE;
```

It seems that ScanCanUseBatching() didn't consider this.

4) struct TupleBatch

```
	struct TupleTableSlot **inslots; /* slots for tuples read "into" batch */
	struct TupleTableSlot **outslots; /* slots for tuples going "out of"
									   * batch */
	struct TupleTableSlot **activeslots;
```

I think we can remove the word "struct".

5) ExecScanExtendedBatchSlot()

```
		/* Get next input slot from current batch, or refill */
		if (!TupleBatchHasMore(b))
		{
			if (!accessBatchMtd(node))
				return NULL;
		}
```

I think we cannot just return NULL here, see comments in ExecScanExtended():

```
		/*
		 * if the slot returned by the accessMtd contains NULL, then it means
		 * there is nothing more to scan so we just return an empty slot,
		 * being careful to use the projection result slot so it has correct
		 * tupleDesc.
		 */
		if (TupIsNull(slot))
		{
			if (projInfo)
				return ExecClearTuple(projInfo->pi_state.resultslot);
			else
				return slot;
		}
```

And why not just write this function like ExecScanExtended() and ExecScanFetch()?

--
Regards,
ChangAo Chen


view thread (9+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Batching in executor
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox