public inbox for [email protected]  
help / color / mirror / Atom feed
From: Bharath Rupireddy <[email protected]>
To: PostgreSQL-development <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Dilip Kumar <[email protected]>
Cc: Luc Vlaming <[email protected]>
Cc: Justin Pryzby <[email protected]>
Cc: Jeff Davis <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Matthias van de Meent <[email protected]>
Subject: Re: New Table Access Methods for Multi and Single Inserts
Date: Mon, 29 Jan 2024 17:16:47 +0530
Message-ID: <CALj2ACWT0Rz8oybWBm5W4CeS0DvFkwaw-pEvGArhDLyPbZnW_g@mail.gmail.com> (raw)
In-Reply-To: <CALj2ACVE2h=LnFnpr3rh+6SZzdwzW5EZOYG2Z0t=p28Fn75eag@mail.gmail.com>
References: <CALj2ACXdrOmB6Na9amHWZHKvRT3Z0nwTRsCwoMT-npOBtmXLXg@mail.gmail.com>
	<[email protected]>
	<CALj2ACX5UMWVFdrRNUE0KDrg54WV1cumBXwcETXhrPc1ibKAQA@mail.gmail.com>
	<CAAWbhmj5Pio3nOUakObzLGCSS9dwFfgsNVDhwTGzXNwZc00uCQ@mail.gmail.com>
	<CALj2ACVHC=c6eC9SRxhcTUrnXvNDNkEBgedi2WkVJYRb=0sWYw@mail.gmail.com>
	<CALj2ACVE2h=LnFnpr3rh+6SZzdwzW5EZOYG2Z0t=p28Fn75eag@mail.gmail.com>

On Mon, Jan 29, 2024 at 12:57 PM Bharath Rupireddy
<[email protected]> wrote:
>
> On Wed, Jan 17, 2024 at 10:57 PM Bharath Rupireddy
> <[email protected]> wrote:
> >
> > Thank you. I'm attaching v8 patch-set here which includes use of new
> > insert TAMs for COPY FROM. With this, postgres not only will have the
> > new TAM for inserts, but they also can make the following commands
> > faster - CREATE TABLE AS, SELECT INTO, CREATE MATERIALIZED VIEW,
> > REFRESH MATERIALIZED VIEW and COPY FROM. I'll perform some testing in
> > the coming days and post the results here, until then I appreciate any
> > feedback on the patches.
> >
> > I've also added this proposal to CF -
> > https://commitfest.postgresql.org/47/4777/.
>
> Some of the tests related to Incremental Sort added by a recent commit
> 0452b461bc4 in aggregates.sql are failing when the multi inserts
> feature is used for CTAS (like done in 0002 patch). I'm not so sure if
> it's because of the reduction in the CTAS execution times. Execution
> time for table 'btg' created with CREATE TABLE AS added by commit
> 0452b461bc4 with single inserts is 25.3 msec, with multi inserts is
> 17.7 msec. This means that the multi inserts are about 1.43 times or
> 30.04% faster  than the single inserts. Couple of ways to make these
> tests pick Incremental Sort as expected - 1) CLUSTER btg USING abc; or
> 2) increase the number of rows in table btg to 100K from 10K. FWIW, if
> I reduce the number of rows in the table from 10K to 1K, the
> Incremental Sort won't get picked on HEAD with CTAS using single
> inserts. Hence, I chose option (2) to fix the issue.
>
> Please find the attached v9 patch set.
>
> [1]
>  -- Engage incremental sort
>  explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
> -                   QUERY PLAN
> --------------------------------------------------
> +          QUERY PLAN
> +------------------------------
>   Group
>     Group Key: x, y, z, w
> -   ->  Incremental Sort
> +   ->  Sort
>           Sort Key: x, y, z, w
> -         Presorted Key: x, y
> -         ->  Index Scan using btg_x_y_idx on btg
> -(6 rows)
> +         ->  Seq Scan on btg
> +(5 rows)

CF bot machine with Windows isn't happy with the compilation [1], so
fixed those warnings and attached v10 patch set.

[1]
[07:35:25.458] [632/2212] Compiling C object
src/backend/postgres_lib.a.p/commands_copyfrom.c.obj
[07:35:25.458] c:\cirrus\src\include\access\tableam.h(1574) : warning
C4715: 'table_multi_insert_slots': not all control paths return a
value
[07:35:25.458] c:\cirrus\src\include\access\tableam.h(1522) : warning
C4715: 'table_insert_begin': not all control paths return a value
[07:35:25.680] c:\cirrus\src\include\access\tableam.h(1561) : warning
C4715: 'table_multi_insert_next_free_slot': not all control paths
return a value
[07:35:25.680] [633/2212] Compiling C object
src/backend/postgres_lib.a.p/commands_createas.c.obj
[07:35:25.680] c:\cirrus\src\include\access\tableam.h(1522) : warning
C4715: 'table_insert_begin': not all control paths return a value
[07:35:26.310] [646/2212] Compiling C object
src/backend/postgres_lib.a.p/commands_matview.c.obj
[07:35:26.310] c:\cirrus\src\include\access\tableam.h(1522) : warning
C4715: 'table_insert_begin': not all control paths return a value

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Attachments:

  [application/octet-stream] v10-0001-New-TAMs-for-inserts.patch (15.9K, 2-v10-0001-New-TAMs-for-inserts.patch)
  download | inline diff:
From 3c892cf5c2df949efac1ec5dc8fc390b868fe400 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Mon, 29 Jan 2024 10:59:41 +0000
Subject: [PATCH v10 1/4] New TAMs for inserts

---
 src/backend/access/heap/heapam.c         | 224 +++++++++++++++++++++++
 src/backend/access/heap/heapam_handler.c |   9 +
 src/include/access/heapam.h              |  49 +++++
 src/include/access/tableam.h             | 138 ++++++++++++++
 4 files changed, 420 insertions(+)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 707460a536..7df305380e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -68,6 +68,7 @@
 #include "utils/datum.h"
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
+#include "utils/memutils.h"
 #include "utils/relcache.h"
 #include "utils/snapmgr.h"
 #include "utils/spccache.h"
@@ -2446,6 +2447,229 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	pgstat_count_heap_insert(relation, ntuples);
 }
 
+/*
+ * Initialize state required for an insert a single tuple or multiple tuples
+ * into a heap.
+ */
+TableInsertState *
+heap_insert_begin(Relation rel, CommandId cid, int am_flags, int insert_flags)
+{
+	TableInsertState *tistate;
+
+	tistate = palloc0(sizeof(TableInsertState));
+	tistate->rel = rel;
+	tistate->cid = cid;
+	tistate->am_flags = am_flags;
+	tistate->insert_flags = insert_flags;
+
+	if ((am_flags & TABLEAM_MULTI_INSERTS) != 0 ||
+		(am_flags & TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY))
+		tistate->am_data = palloc0(sizeof(HeapInsertState));
+
+	if ((am_flags & TABLEAM_MULTI_INSERTS) != 0)
+	{
+		HeapMultiInsertState *mistate;
+
+		mistate = palloc0(sizeof(HeapMultiInsertState));
+		mistate->slots = palloc0(sizeof(TupleTableSlot *) * HEAP_MAX_BUFFERED_SLOTS);
+
+		mistate->context = AllocSetContextCreate(CurrentMemoryContext,
+												 "heap_multi_insert_v2 memory context",
+												 ALLOCSET_DEFAULT_SIZES);
+
+		((HeapInsertState *) tistate->am_data)->mistate = mistate;
+	}
+
+	if ((am_flags & TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY) != 0)
+		((HeapInsertState *) tistate->am_data)->bistate = GetBulkInsertState();
+
+	return tistate;
+}
+
+/*
+ * Insert a single tuple into a heap.
+ */
+void
+heap_insert_v2(TableInsertState * state, TupleTableSlot *slot)
+{
+	bool		shouldFree = true;
+	HeapTuple	tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
+	BulkInsertState bistate = NULL;
+
+	Assert(state->am_data != NULL &&
+		   ((HeapInsertState *) state->am_data)->mistate == NULL);
+
+	/* Update tuple with table oid */
+	slot->tts_tableOid = RelationGetRelid(state->rel);
+	tuple->t_tableOid = slot->tts_tableOid;
+
+	if (state->am_data != NULL &&
+		((HeapInsertState *) state->am_data)->bistate != NULL)
+		bistate = ((HeapInsertState *) state->am_data)->bistate;
+
+	/* Perform insertion, and copy the resulting ItemPointer */
+	heap_insert(state->rel, tuple, state->cid, state->insert_flags,
+				bistate);
+	ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
+
+	if (shouldFree)
+		pfree(tuple);
+}
+
+/*
+ * Create/return next free slot from multi-insert buffered slots array.
+ */
+TupleTableSlot *
+heap_multi_insert_next_free_slot(TableInsertState * state)
+{
+	TupleTableSlot *slot;
+	HeapMultiInsertState *mistate;
+
+	Assert(state->am_data != NULL &&
+		   ((HeapInsertState *) state->am_data)->mistate != NULL);
+
+	mistate = ((HeapInsertState *) state->am_data)->mistate;
+	slot = mistate->slots[mistate->cur_slots];
+
+	if (slot == NULL)
+	{
+		slot = table_slot_create(state->rel, NULL);
+		mistate->slots[mistate->cur_slots] = slot;
+	}
+	else
+		ExecClearTuple(slot);
+
+	return slot;
+}
+
+/*
+ * Store passed-in tuple into in-memory buffered slots. When full, insert
+ * multiple tuples from the buffers into heap.
+ */
+void
+heap_multi_insert_v2(TableInsertState * state, TupleTableSlot *slot)
+{
+	TupleTableSlot *dstslot;
+	HeapMultiInsertState *mistate;
+
+	Assert(state->am_data != NULL &&
+		   ((HeapInsertState *) state->am_data)->mistate != NULL);
+
+	mistate = ((HeapInsertState *) state->am_data)->mistate;
+	dstslot = mistate->slots[mistate->cur_slots];
+
+	if (dstslot == NULL)
+	{
+		dstslot = table_slot_create(state->rel, NULL);
+		mistate->slots[mistate->cur_slots] = dstslot;
+	}
+
+	/*
+	 * Caller may have got the slot using heap_multi_insert_next_free_slot,
+	 * filled it and passed. So, skip copying in such a case.
+	 */
+	if ((state->am_flags & TABLEAM_SKIP_MULTI_INSERTS_FLUSH) == 0)
+	{
+		ExecClearTuple(dstslot);
+		ExecCopySlot(dstslot, slot);
+	}
+	else
+		Assert(dstslot == slot);
+
+	mistate->cur_slots++;
+
+	/*
+	 * When passed-in slot is already materialized, memory allocated in slot's
+	 * memory context is a close approximation for us to track the required
+	 * space for the tuple in slot.
+	 *
+	 * For non-materialized slots, the flushing decision happens solely on the
+	 * number of tuples stored in the buffer.
+	 */
+	if (TTS_SHOULDFREE(slot))
+		mistate->cur_size += MemoryContextMemAllocated(slot->tts_mcxt, false);
+
+	if ((state->am_flags & TABLEAM_SKIP_MULTI_INSERTS_FLUSH) == 0 &&
+		(mistate->cur_slots >= HEAP_MAX_BUFFERED_SLOTS ||
+		 mistate->cur_size >= HEAP_MAX_BUFFERED_BYTES))
+		heap_multi_insert_flush(state);
+}
+
+/*
+ * Return pointer to multi-insert buffered slots array and number of currently
+ * occupied slots.
+ */
+TupleTableSlot **
+heap_multi_insert_slots(TableInsertState * state, int *num_slots)
+{
+	HeapMultiInsertState *mistate;
+
+	mistate = ((HeapInsertState *) state->am_data)->mistate;
+	*num_slots = mistate->cur_slots;
+
+	return mistate->slots;
+}
+
+/*
+ * Insert multiple tuples from in-memory buffered slots into heap.
+ */
+void
+heap_multi_insert_flush(TableInsertState * state)
+{
+	HeapMultiInsertState *mistate;
+	BulkInsertState bistate = NULL;
+	MemoryContext oldcontext;
+
+	mistate = ((HeapInsertState *) state->am_data)->mistate;
+
+	if (state->am_data != NULL &&
+		((HeapInsertState *) state->am_data)->bistate != NULL)
+		bistate = ((HeapInsertState *) state->am_data)->bistate;
+
+	oldcontext = MemoryContextSwitchTo(mistate->context);
+	heap_multi_insert(state->rel, mistate->slots, mistate->cur_slots,
+					  state->cid, state->insert_flags, bistate);
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextReset(mistate->context);
+
+	mistate->cur_slots = 0;
+	mistate->cur_size = 0;
+}
+
+/*
+ * Clean up state used to insert a single or multiple tuples into a heap.
+ */
+void
+heap_insert_end(TableInsertState * state)
+{
+	if (state->am_data != NULL &&
+		((HeapInsertState *) state->am_data)->mistate != NULL)
+	{
+		HeapMultiInsertState *mistate =
+			((HeapInsertState *) state->am_data)->mistate;
+
+		/* Insert remaining tuples from multi-insert buffers */
+		if (mistate->cur_slots > 0 || mistate->cur_size > 0)
+			heap_multi_insert_flush(state);
+
+		MemoryContextDelete(mistate->context);
+
+		for (int i = 0; i < HEAP_MAX_BUFFERED_SLOTS && mistate->slots[i] != NULL; i++)
+			ExecDropSingleTupleTableSlot(mistate->slots[i]);
+
+		pfree(mistate);
+		((HeapInsertState *) state->am_data)->mistate = NULL;
+	}
+
+	if (state->am_data != NULL &&
+		((HeapInsertState *) state->am_data)->bistate != NULL)
+		FreeBulkInsertState(((HeapInsertState *) state->am_data)->bistate);
+
+	pfree(state->am_data);
+	state->am_data = NULL;
+	pfree(state);
+}
+
 /*
  *	simple_heap_insert - insert a tuple
  *
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index d15a02b2be..795177812d 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2564,6 +2564,15 @@ static const TableAmRoutine heapam_methods = {
 	.tuple_insert_speculative = heapam_tuple_insert_speculative,
 	.tuple_complete_speculative = heapam_tuple_complete_speculative,
 	.multi_insert = heap_multi_insert,
+
+	.tuple_insert_begin = heap_insert_begin,
+	.tuple_insert_v2 = heap_insert_v2,
+	.tuple_multi_insert_next_free_slot = heap_multi_insert_next_free_slot,
+	.tuple_multi_insert_v2 = heap_multi_insert_v2,
+	.tuple_multi_insert_slots = heap_multi_insert_slots,
+	.tuple_multi_insert_flush = heap_multi_insert_flush,
+	.tuple_insert_end = heap_insert_end,
+
 	.tuple_delete = heapam_tuple_delete,
 	.tuple_update = heapam_tuple_update,
 	.tuple_lock = heapam_tuple_lock,
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4b133f6859..053be18110 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -225,6 +225,40 @@ htsv_get_valid_status(int status)
 	return (HTSV_Result) status;
 }
 
+/*
+ * Maximum number of slots that multi-insert buffers can hold.
+ *
+ * Caution: Don't make this too big, as we could end up with this many tuples
+ * stored in multi insert buffer. For instance, increasing this can cause
+ * quadratic growth in memory requirements during copies into partitioned
+ * tables with a large number of partitions.
+ */
+#define HEAP_MAX_BUFFERED_SLOTS		1000
+
+/* Maximum size of all tuples that multi-insert buffers can hold */
+#define HEAP_MAX_BUFFERED_BYTES		65535
+
+typedef struct HeapMultiInsertState
+{
+	/* Memory context to use for flushing multi-insert buffers */
+	MemoryContext context;
+
+	/* Array of buffered slots */
+	TupleTableSlot **slots;
+
+	/* Number of slots that multi-insert buffers currently hold */
+	int			cur_slots;
+
+	/* Size of all tuples that multi-insert buffers currently hold */
+	Size		cur_size;
+}			HeapMultiInsertState;
+
+typedef struct HeapInsertState
+{
+	struct BulkInsertStateData *bistate;
+	HeapMultiInsertState *mistate;
+}			HeapInsertState;
+
 /* ----------------
  *		function prototypes for heap access method
  *
@@ -275,6 +309,21 @@ extern void heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 extern void heap_multi_insert(Relation relation, struct TupleTableSlot **slots,
 							  int ntuples, CommandId cid, int options,
 							  BulkInsertState bistate);
+
+extern TableInsertState * heap_insert_begin(Relation rel,
+											CommandId cid,
+											int am_flags,
+											int insert_flags);
+extern void heap_insert_v2(TableInsertState * state,
+						   TupleTableSlot *slot);
+extern TupleTableSlot *heap_multi_insert_next_free_slot(TableInsertState * state);
+extern void heap_multi_insert_v2(TableInsertState * state,
+								 TupleTableSlot *slot);
+extern TupleTableSlot **heap_multi_insert_slots(TableInsertState * state,
+												int *num_slots);
+extern void heap_multi_insert_flush(TableInsertState * state);
+extern void heap_insert_end(TableInsertState * state);
+
 extern TM_Result heap_delete(Relation relation, ItemPointer tid,
 							 CommandId cid, Snapshot crosscheck, bool wait,
 							 struct TM_FailureData *tmfd, bool changingPart);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 5f8474871d..834de15b9b 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -247,6 +247,43 @@ typedef struct TM_IndexDeleteOp
 	TM_IndexStatus *status;
 } TM_IndexDeleteOp;
 
+/* Use multi inserts, i.e. buffer multiple tuples and insert them at once */
+#define TABLEAM_MULTI_INSERTS 0x000001
+
+/* Use BAS_BULKWRITE buffer access strategy */
+#define TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY 0x000002
+
+/*
+ * Skip flushing buffered tuples automatically. Responsibility lies with the
+ * caller to flush the buffered tuples.
+ */
+#define TABLEAM_SKIP_MULTI_INSERTS_FLUSH 0x000004
+
+
+/* Holds table insert state. */
+typedef struct TableInsertState
+{
+	/* Table AM-agnostic data starts here */
+
+	Relation	rel;			/* Target relation */
+
+	/*
+	 * Command ID for this insertion. If required, change this for each pass
+	 * of insert functions.
+	 */
+	CommandId	cid;
+
+	/* Table AM options (TABLEAM_XXX macros) */
+	int			am_flags;
+
+	/* table_tuple_insert performance options (TABLE_INSERT_XXX macros) */
+	int			insert_flags;
+
+	/* Table AM specific data starts here */
+
+	void	   *am_data;
+}			TableInsertState;
+
 /* "options" flag bits for table_tuple_insert */
 /* TABLE_INSERT_SKIP_WAL was 0x0001; RelationNeedsWAL() now governs */
 #define TABLE_INSERT_SKIP_FSM		0x0002
@@ -522,6 +559,20 @@ typedef struct TableAmRoutine
 	void		(*multi_insert) (Relation rel, TupleTableSlot **slots, int nslots,
 								 CommandId cid, int options, struct BulkInsertStateData *bistate);
 
+	TableInsertState *(*tuple_insert_begin) (Relation rel,
+											 CommandId cid,
+											 int am_flags,
+											 int insert_flags);
+	void		(*tuple_insert_v2) (TableInsertState * state,
+									TupleTableSlot *slot);
+	void		(*tuple_multi_insert_v2) (TableInsertState * state,
+										  TupleTableSlot *slot);
+	TupleTableSlot *(*tuple_multi_insert_next_free_slot) (TableInsertState * state);
+	TupleTableSlot **(*tuple_multi_insert_slots) (TableInsertState * state,
+												  int *num_slots);
+	void		(*tuple_multi_insert_flush) (TableInsertState * state);
+	void		(*tuple_insert_end) (TableInsertState * state);
+
 	/* see table_tuple_delete() for reference about parameters */
 	TM_Result	(*tuple_delete) (Relation rel,
 								 ItemPointer tid,
@@ -1456,6 +1507,93 @@ table_multi_insert(Relation rel, TupleTableSlot **slots, int nslots,
 								  cid, options, bistate);
 }
 
+static inline TableInsertState *
+table_insert_begin(Relation rel, CommandId cid, int am_flags,
+				   int insert_flags)
+{
+	if (rel->rd_tableam && rel->rd_tableam->tuple_insert_begin)
+		return rel->rd_tableam->tuple_insert_begin(rel, cid, am_flags,
+												   insert_flags);
+	else
+	{
+		elog(ERROR, "table_insert_begin access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(rel));
+		return NULL;			/* keep compiler quiet */
+	}
+}
+
+static inline void
+table_tuple_insert_v2(TableInsertState * state, TupleTableSlot *slot)
+{
+	if (state->rel->rd_tableam &&
+		state->rel->rd_tableam->tuple_insert_v2)
+		state->rel->rd_tableam->tuple_insert_v2(state, slot);
+	else
+		elog(ERROR, "table_tuple_insert_v2 access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(state->rel));
+}
+
+static inline void
+table_multi_insert_v2(TableInsertState * state, TupleTableSlot *slot)
+{
+	if (state->rel->rd_tableam &&
+		state->rel->rd_tableam->tuple_multi_insert_v2)
+		state->rel->rd_tableam->tuple_multi_insert_v2(state, slot);
+	else
+		elog(ERROR, "table_multi_insert_v2 access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(state->rel));
+}
+
+static inline TupleTableSlot *
+table_multi_insert_next_free_slot(TableInsertState * state)
+{
+	if (state->rel->rd_tableam &&
+		state->rel->rd_tableam->tuple_multi_insert_next_free_slot)
+		return state->rel->rd_tableam->tuple_multi_insert_next_free_slot(state);
+	else
+	{
+		elog(ERROR, "table_multi_insert_next_free_slot access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(state->rel));
+		return NULL;			/* keep compiler quiet */
+	}
+}
+
+static inline TupleTableSlot **
+table_multi_insert_slots(TableInsertState * state, int *num_slots)
+{
+	if (state->rel->rd_tableam &&
+		state->rel->rd_tableam->tuple_multi_insert_slots)
+		return state->rel->rd_tableam->tuple_multi_insert_slots(state, num_slots);
+	else
+	{
+		elog(ERROR, "table_multi_insert_slots access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(state->rel));
+		return NULL;			/* keep compiler quiet */
+	}
+}
+
+static inline void
+table_multi_insert_flush(TableInsertState * state)
+{
+	if (state->rel->rd_tableam &&
+		state->rel->rd_tableam->tuple_multi_insert_flush)
+		state->rel->rd_tableam->tuple_multi_insert_flush(state);
+	else
+		elog(ERROR, "table_multi_insert_flush access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(state->rel));
+}
+
+static inline void
+table_insert_end(TableInsertState * state)
+{
+	if (state->rel->rd_tableam &&
+		state->rel->rd_tableam->tuple_insert_end)
+		state->rel->rd_tableam->tuple_insert_end(state);
+	else
+		elog(ERROR, "table_insert_end access method is not implemented for relation \"%s\"",
+			 RelationGetRelationName(state->rel));
+}
+
 /*
  * Delete a tuple.
  *
-- 
2.34.1



  [application/octet-stream] v10-0002-Optimize-CTAS-with-multi-inserts.patch (3.8K, 3-v10-0002-Optimize-CTAS-with-multi-inserts.patch)
  download | inline diff:
From ab21bc9db0b6a033db3d6d00f72c5b1abf445240 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Mon, 29 Jan 2024 11:01:56 +0000
Subject: [PATCH v10 2/4] Optimize CTAS with multi inserts

---
 src/backend/commands/createas.c          | 25 +++++++++---------------
 src/test/regress/expected/aggregates.out |  2 +-
 src/test/regress/sql/aggregates.sql      |  2 +-
 3 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 16a2fe65e6..3a02ea9578 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -58,9 +58,7 @@ typedef struct
 	/* These fields are filled by intorel_startup: */
 	Relation	rel;			/* relation to write to */
 	ObjectAddress reladdr;		/* address of rel, for ExecCreateTableAs */
-	CommandId	output_cid;		/* cmin to insert in output tuples */
-	int			ti_options;		/* table_tuple_insert performance options */
-	BulkInsertState bistate;	/* bulk insert state */
+	TableInsertState *ti_state; /* table insert state */
 } DR_intorel;
 
 /* utility functions for CTAS definition creation */
@@ -557,17 +555,19 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
 	 */
 	myState->rel = intoRelationDesc;
 	myState->reladdr = intoRelationAddr;
-	myState->output_cid = GetCurrentCommandId(true);
-	myState->ti_options = TABLE_INSERT_SKIP_FSM;
 
 	/*
 	 * If WITH NO DATA is specified, there is no need to set up the state for
 	 * bulk inserts as there are no tuples to insert.
 	 */
 	if (!into->skipData)
-		myState->bistate = GetBulkInsertState();
+		myState->ti_state = table_insert_begin(intoRelationDesc,
+											   GetCurrentCommandId(true),
+											   TABLEAM_MULTI_INSERTS |
+											   TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY,
+											   TABLE_INSERT_SKIP_FSM);
 	else
-		myState->bistate = NULL;
+		myState->ti_state = NULL;
 
 	/*
 	 * Valid smgr_targblock implies something already wrote to the relation.
@@ -595,11 +595,7 @@ intorel_receive(TupleTableSlot *slot, DestReceiver *self)
 		 * would not be cheap either. This also doesn't allow accessing per-AM
 		 * data (say a tuple's xmin), but since we don't do that here...
 		 */
-		table_tuple_insert(myState->rel,
-						   slot,
-						   myState->output_cid,
-						   myState->ti_options,
-						   myState->bistate);
+		table_multi_insert_v2(myState->ti_state, slot);
 	}
 
 	/* We know this is a newly created relation, so there are no indexes */
@@ -617,10 +613,7 @@ intorel_shutdown(DestReceiver *self)
 	IntoClause *into = myState->into;
 
 	if (!into->skipData)
-	{
-		FreeBulkInsertState(myState->bistate);
-		table_finish_bulk_insert(myState->rel, myState->ti_options);
-	}
+		table_insert_end(myState->ti_state);
 
 	/* close rel, but keep lock until commit */
 	table_close(myState->rel, NoLock);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 7a73c19314..2889fd315d 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2734,7 +2734,7 @@ CREATE TABLE btg AS SELECT
   i % 100 AS y,
   'abc' || i % 10 AS z,
   i AS w
-FROM generate_series(1,10000) AS i;
+FROM generate_series(1,100000) AS i;
 CREATE INDEX btg_x_y_idx ON btg(x,y);
 ANALYZE btg;
 -- GROUP BY optimization by reorder columns by frequency
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 916dbf908f..99f890bb85 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -1187,7 +1187,7 @@ CREATE TABLE btg AS SELECT
   i % 100 AS y,
   'abc' || i % 10 AS z,
   i AS w
-FROM generate_series(1,10000) AS i;
+FROM generate_series(1,100000) AS i;
 CREATE INDEX btg_x_y_idx ON btg(x,y);
 ANALYZE btg;
 
-- 
2.34.1



  [application/octet-stream] v10-0003-Optimize-RMV-with-multi-inserts.patch (2.9K, 4-v10-0003-Optimize-RMV-with-multi-inserts.patch)
  download | inline diff:
From 621aa97a1708ba178f5ebf6aca869788a2cf1b56 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Mon, 29 Jan 2024 11:02:19 +0000
Subject: [PATCH v10 3/4] Optimize RMV with multi inserts

---
 src/backend/commands/matview.c | 34 ++++++++++++----------------------
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 1dcfbe879b..f84c79f5f0 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -52,10 +52,7 @@ typedef struct
 	DestReceiver pub;			/* publicly-known function pointers */
 	Oid			transientoid;	/* OID of new heap into which to store */
 	/* These fields are filled by transientrel_startup: */
-	Relation	transientrel;	/* relation to write to */
-	CommandId	output_cid;		/* cmin to insert in output tuples */
-	int			ti_options;		/* table_tuple_insert performance options */
-	BulkInsertState bistate;	/* bulk insert state */
+	TableInsertState *ti_state; /* table insert state */
 } DR_transientrel;
 
 static int	matview_maintenance_depth = 0;
@@ -457,13 +454,13 @@ transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
 
 	transientrel = table_open(myState->transientoid, NoLock);
 
-	/*
-	 * Fill private fields of myState for use by later routines
-	 */
-	myState->transientrel = transientrel;
-	myState->output_cid = GetCurrentCommandId(true);
-	myState->ti_options = TABLE_INSERT_SKIP_FSM | TABLE_INSERT_FROZEN;
-	myState->bistate = GetBulkInsertState();
+	/* Fill private fields of myState for use by later routines */
+	myState->ti_state = table_insert_begin(transientrel,
+										   GetCurrentCommandId(true),
+										   TABLEAM_MULTI_INSERTS |
+										   TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY,
+										   TABLE_INSERT_SKIP_FSM |
+										   TABLE_INSERT_FROZEN);
 
 	/*
 	 * Valid smgr_targblock implies something already wrote to the relation.
@@ -488,12 +485,7 @@ transientrel_receive(TupleTableSlot *slot, DestReceiver *self)
 	 * cheap either. This also doesn't allow accessing per-AM data (say a
 	 * tuple's xmin), but since we don't do that here...
 	 */
-
-	table_tuple_insert(myState->transientrel,
-					   slot,
-					   myState->output_cid,
-					   myState->ti_options,
-					   myState->bistate);
+	table_multi_insert_v2(myState->ti_state, slot);
 
 	/* We know this is a newly created relation, so there are no indexes */
 
@@ -507,14 +499,12 @@ static void
 transientrel_shutdown(DestReceiver *self)
 {
 	DR_transientrel *myState = (DR_transientrel *) self;
+	Relation	transientrel = myState->ti_state->rel;
 
-	FreeBulkInsertState(myState->bistate);
-
-	table_finish_bulk_insert(myState->transientrel, myState->ti_options);
+	table_insert_end(myState->ti_state);
 
 	/* close transientrel, but keep lock until commit */
-	table_close(myState->transientrel, NoLock);
-	myState->transientrel = NULL;
+	table_close(transientrel, NoLock);
 }
 
 /*
-- 
2.34.1



  [application/octet-stream] v10-0004-Use-new-multi-insert-TAM-for-COPY-FROM.patch (6.3K, 5-v10-0004-Use-new-multi-insert-TAM-for-COPY-FROM.patch)
  download | inline diff:
From 169131f28e09c41b0b100f953b54dd16b2e3185a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Mon, 29 Jan 2024 11:02:37 +0000
Subject: [PATCH v10 4/4] Use new multi insert TAM for COPY FROM

---
 src/backend/commands/copyfrom.c | 92 ++++++++++++++++++---------------
 1 file changed, 50 insertions(+), 42 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..8abf33aa97 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -77,10 +77,9 @@
 /* Stores multi-insert data related to a single relation in CopyFrom. */
 typedef struct CopyMultiInsertBuffer
 {
-	TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */
+	TableInsertState *ti_state; /* Table insert state; NULL if foreign table */
+	TupleTableSlot **slots;		/* Array to store tuples */
 	ResultRelInfo *resultRelInfo;	/* ResultRelInfo for 'relid' */
-	BulkInsertState bistate;	/* BulkInsertState for this rel if plain
-								 * table; NULL if foreign table */
 	int			nused;			/* number of 'slots' containing tuples */
 	uint64		linenos[MAX_BUFFERED_TUPLES];	/* Line # of tuple in copy
 												 * stream */
@@ -223,14 +222,31 @@ limit_printout_length(const char *str)
  * ResultRelInfo.
  */
 static CopyMultiInsertBuffer *
-CopyMultiInsertBufferInit(ResultRelInfo *rri)
+CopyMultiInsertBufferInit(CopyMultiInsertInfo *miinfo, ResultRelInfo *rri)
 {
 	CopyMultiInsertBuffer *buffer;
 
 	buffer = (CopyMultiInsertBuffer *) palloc(sizeof(CopyMultiInsertBuffer));
-	memset(buffer->slots, 0, sizeof(TupleTableSlot *) * MAX_BUFFERED_TUPLES);
+
+	if (rri->ri_FdwRoutine == NULL)
+	{
+		int			num_slots;
+
+		buffer->ti_state = table_insert_begin(rri->ri_RelationDesc,
+											  miinfo->mycid,
+											  TABLEAM_MULTI_INSERTS |
+											  TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY |
+											  TABLEAM_SKIP_MULTI_INSERTS_FLUSH,
+											  miinfo->ti_options);
+		buffer->slots = table_multi_insert_slots(buffer->ti_state, &num_slots);
+	}
+	else
+	{
+		buffer->slots = palloc0(sizeof(TupleTableSlot *) * MAX_BUFFERED_TUPLES);
+		buffer->ti_state = NULL;
+	}
+
 	buffer->resultRelInfo = rri;
-	buffer->bistate = (rri->ri_FdwRoutine == NULL) ? GetBulkInsertState() : NULL;
 	buffer->nused = 0;
 
 	return buffer;
@@ -245,7 +261,7 @@ CopyMultiInsertInfoSetupBuffer(CopyMultiInsertInfo *miinfo,
 {
 	CopyMultiInsertBuffer *buffer;
 
-	buffer = CopyMultiInsertBufferInit(rri);
+	buffer = CopyMultiInsertBufferInit(miinfo, rri);
 
 	/* Setup back-link so we can easily find this buffer again */
 	rri->ri_CopyMultiInsertBuffer = buffer;
@@ -322,8 +338,6 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
 		int			batch_size = resultRelInfo->ri_BatchSize;
 		int			sent = 0;
 
-		Assert(buffer->bistate == NULL);
-
 		/* Ensure that the FDW supports batching and it's enabled */
 		Assert(resultRelInfo->ri_FdwRoutine->ExecForeignBatchInsert);
 		Assert(batch_size > 1);
@@ -395,13 +409,8 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
 	}
 	else
 	{
-		CommandId	mycid = miinfo->mycid;
-		int			ti_options = miinfo->ti_options;
 		bool		line_buf_valid = cstate->line_buf_valid;
 		uint64		save_cur_lineno = cstate->cur_lineno;
-		MemoryContext oldcontext;
-
-		Assert(buffer->bistate != NULL);
 
 		/*
 		 * Print error context information correctly, if one of the operations
@@ -409,18 +418,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
 		 */
 		cstate->line_buf_valid = false;
 
-		/*
-		 * table_multi_insert may leak memory, so switch to short-lived memory
-		 * context before calling it.
-		 */
-		oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-		table_multi_insert(resultRelInfo->ri_RelationDesc,
-						   slots,
-						   nused,
-						   mycid,
-						   ti_options,
-						   buffer->bistate);
-		MemoryContextSwitchTo(oldcontext);
+		table_multi_insert_flush(buffer->ti_state);
 
 		for (i = 0; i < nused; i++)
 		{
@@ -435,7 +433,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
 				cstate->cur_lineno = buffer->linenos[i];
 				recheckIndexes =
 					ExecInsertIndexTuples(resultRelInfo,
-										  buffer->slots[i], estate, false,
+										  slots[i], estate, false,
 										  false, NULL, NIL, false);
 				ExecARInsertTriggers(estate, resultRelInfo,
 									 slots[i], recheckIndexes,
@@ -493,20 +491,15 @@ CopyMultiInsertBufferCleanup(CopyMultiInsertInfo *miinfo,
 	resultRelInfo->ri_CopyMultiInsertBuffer = NULL;
 
 	if (resultRelInfo->ri_FdwRoutine == NULL)
-	{
-		Assert(buffer->bistate != NULL);
-		FreeBulkInsertState(buffer->bistate);
-	}
+		table_insert_end(buffer->ti_state);
 	else
-		Assert(buffer->bistate == NULL);
-
-	/* Since we only create slots on demand, just drop the non-null ones. */
-	for (i = 0; i < MAX_BUFFERED_TUPLES && buffer->slots[i] != NULL; i++)
-		ExecDropSingleTupleTableSlot(buffer->slots[i]);
+	{
+		/* Since we only create slots on demand, just drop the non-null ones. */
+		for (i = 0; i < MAX_BUFFERED_TUPLES && buffer->slots[i] != NULL; i++)
+			ExecDropSingleTupleTableSlot(buffer->slots[i]);
 
-	if (resultRelInfo->ri_FdwRoutine == NULL)
-		table_finish_bulk_insert(resultRelInfo->ri_RelationDesc,
-								 miinfo->ti_options);
+		pfree(buffer->slots);
+	}
 
 	pfree(buffer);
 }
@@ -593,13 +586,25 @@ CopyMultiInsertInfoNextFreeSlot(CopyMultiInsertInfo *miinfo,
 {
 	CopyMultiInsertBuffer *buffer = rri->ri_CopyMultiInsertBuffer;
 	int			nused = buffer->nused;
+	TupleTableSlot *slot;
 
 	Assert(buffer != NULL);
 	Assert(nused < MAX_BUFFERED_TUPLES);
 
-	if (buffer->slots[nused] == NULL)
-		buffer->slots[nused] = table_slot_create(rri->ri_RelationDesc, NULL);
-	return buffer->slots[nused];
+	if (rri->ri_FdwRoutine == NULL)
+		slot = table_multi_insert_next_free_slot(buffer->ti_state);
+	else
+	{
+		if (buffer->slots[nused] == NULL)
+		{
+			slot = table_slot_create(rri->ri_RelationDesc, NULL);
+			buffer->slots[nused] = slot;
+		}
+		else
+			slot = buffer->slots[nused];
+	}
+
+	return slot;
 }
 
 /*
@@ -615,6 +620,9 @@ CopyMultiInsertInfoStore(CopyMultiInsertInfo *miinfo, ResultRelInfo *rri,
 	Assert(buffer != NULL);
 	Assert(slot == buffer->slots[buffer->nused]);
 
+	if (rri->ri_FdwRoutine == NULL)
+		table_multi_insert_v2(buffer->ti_state, slot);
+
 	/* Store the line number so we can properly report any errors later */
 	buffer->linenos[buffer->nused] = lineno;
 
-- 
2.34.1



view thread (43+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: New Table Access Methods for Multi and Single Inserts
  In-Reply-To: <CALj2ACWT0Rz8oybWBm5W4CeS0DvFkwaw-pEvGArhDLyPbZnW_g@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox