Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kwjEg-0004zi-5n for pgsql-hackers@arkaria.postgresql.org; Tue, 05 Jan 2021 10:06:42 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1kwjEf-0008Sf-4a for pgsql-hackers@arkaria.postgresql.org; Tue, 05 Jan 2021 10:06:41 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kwjEe-0008SY-Sg for pgsql-hackers@lists.postgresql.org; Tue, 05 Jan 2021 10:06:40 +0000 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1kwjEc-0006AP-0d for pgsql-hackers@postgresql.org; Tue, 05 Jan 2021 10:06:40 +0000 Received: by mail-pf1-x431.google.com with SMTP id q22so18082639pfk.12 for ; Tue, 05 Jan 2021 02:06:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=9OgNpk5yExMvT9myJyDi+5zGQX1QtATAXmpNhiAVDAQ=; b=Oshak8La6vJczLVMqxirTbtI0CKHM6x2ni8Nn6IZFrIyoR8TXr5ldKZ2wPFXJGvT2F x1DJJoVciYWFhQJVJO4pKq0WYP+DpdlLPjXLLUWmDLb304tMcy9lPanZtHdjc0rc9OXv ydI8WLbGZRG+3DHKwFjued3X3A4/cP8Q0dWMLuM3DbL81k0xSDXZQLJf9AQw7iuzRQc7 ysAe+5dqWF/VG5xZYUsp7Ol6fb916sGw3mC/R7gh5eksg9Y8Mvu55ArQDl6ipYiE+VYf gAbJccBu4XjkUD4WxwHtIq4KbdHz2EuEL3pNb1dCxiEYsN/EiFL+1YhGZVAitvHlu2ZY XjBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=9OgNpk5yExMvT9myJyDi+5zGQX1QtATAXmpNhiAVDAQ=; b=BvN/G530XrvCxT6twna2748mYA9jBYqdvkAAu+mWpqr33WoQ1Tp42Bs8da32An28U5 Fd/UxfC61Dei0cYb9sIWwXaeq4qbIS/fz09TdnpE0bh6bOYHD3Nza0RsgFAiBUsr2NG5 0lHkSvp43zgEaplvTliAsTBnwU2aBpZ7EWBpMZDbNOF1PhqaHWAxtl4h20XXOK9lf3wZ jn3LOfwsY3m1bJjMHMdf9wS2CO/6vkSrXkEyPJmZMHxWRy7t7lSnbaJGuApJmiwO5clB mbMQW4PviHk0LrTRG7R+QSbszYhzKL3Syez1IxfALtCtTsRuC+LTxRqXlpKMSUZQQCTE /MXQ== X-Gm-Message-State: AOAM5304I21xjw/BROc761PwwKmm5tR3OeP9I56yBgrTyrT9O+XL1rgW 1iWGCHK8j/OhgGnvPqDJoN/PlZpWGS3VuRzPhXc= X-Google-Smtp-Source: ABdhPJyp7CqqVUldodT9SLJN/xzRq/4ezT3SXR9WzS+pSshLkhPoweYFeoRPRcXf1uT5zgiBYVOCyiJa46SxKre+0WQ= X-Received: by 2002:a63:551f:: with SMTP id j31mr54962901pgb.432.1609841195958; Tue, 05 Jan 2021 02:06:35 -0800 (PST) MIME-Version: 1.0 References: <20201217050522.GU30237@telsasoft.com> <20201217204442.GX30237@telsasoft.com> <20201218175439.GA30237@telsasoft.com> <20201221074725.GF30237@telsasoft.com> <20201225023958.GW30237@telsasoft.com> <96eaa813-4ad6-e80a-04a4-cc8082d356dd@swarm64.com> In-Reply-To: <96eaa813-4ad6-e80a-04a4-cc8082d356dd@swarm64.com> From: Bharath Rupireddy Date: Tue, 5 Jan 2021 15:36:24 +0530 Message-ID: Subject: Re: New Table Access Methods for Multi and Single Inserts To: Luc Vlaming Cc: Justin Pryzby , PostgreSQL-development , Andres Freund , Paul Guo , Jeff Davis Content-Type: multipart/alternative; boundary="000000000000a1456f05b8245cf4" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk --000000000000a1456f05b8245cf4 Content-Type: text/plain; charset="UTF-8" On Mon, Jan 4, 2021 at 1:29 PM Luc Vlaming wrote: > > table AM patch [2] be reviewed further? > As to the patches themselves: > > I think the API is a huge step forward! I assume that we want to have a > single-insert API like heap_insert_v2 so that we can encode the > knowledge that there will just be a single insert coming and likely a > commit afterwards? > > Reason I'm asking is that I quite liked the heap_insert_begin parameter > is_multi, which could even be turned into a "expected_rowcount" of the > amount of rows expected to be commited in the transaction (e.g. single, > several, thousands/stream). > If we were to make the API based on expected rowcounts, the whole > heap_insert_v2, heap_insert and heap_multi_insert could be turned into a > single function heap_insert, as the knowledge about buffering of the > slots is then already stored in the TableInsertState, creating an API like: > > // expectedRows: -1 = streaming, otherwise expected rowcount. > TableInsertState* heap_insert_begin(Relation rel, CommandId cid, int > options, int expectedRows); > heap_insert(TableInsertState *state, TupleTableSlot *slot); > > Do you think that's a good idea? IIUC, your suggestion is to use expectedRows and move the multi insert implementation heap_multi_insert_v2 to heap_insert_v2. If that's correct, so heap_insert_v2 will look something like this: heap_insert_v2() { if (single_insert) //do single insertion work, the code in existing heap_insert_v2 comes here else //do multi insertion work, the code in existing heap_multi_insert_v2 comes here } I don't see any problem in combining single and multi insert APIs into one. Having said that, will the APIs be cleaner then? Isn't it going to be confusing if a single heap_insert_v2 API does both the works? With the existing separate APIs, for single insertion, the sequence of the API can be like begin, insert_v2, end and for multi inserts it's like begin, multi_insert_v2, flush, end. I prefer to have a separate multi insert API so that it will make the code look readable. Thoughts? > Two smaller things I'm wondering: > - the clear_mi_slots; why is this not in the HeapMultiInsertState? the > slots themselves are declared there? Firstly, we need to have the buffered slots sometimes(please have a look at the comments in TableInsertState structure) outside the multi_insert API. And we need to have cleared the previously flushed slots before we start buffering in heap_multi_insert_v2(). I can remove the clear_mi_slots flag altogether and do as follows: I will not set mistate->cur_slots to 0 in heap_multi_insert_flush after the flush, I will only set state->flushed to true. In heap_multi_insert_v2, void heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot) { TupleTableSlot *batchslot; HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate; Size sz; Assert(mistate && mistate->slots); * /* if the slots are flushed previously then clear them off before using them again. */ if (state->flushed) { int i; for (i = 0; i < mistate->cur_slots; i++) ExecClearTuple(mistate->slots[i]); mistate->cur_slots = 0; state->flushed = false }* if (mistate->slots[mistate->cur_slots] == NULL) mistate->slots[mistate->cur_slots] = table_slot_create(state->rel, NULL); batchslot = mistate->slots[mistate->cur_slots]; ExecCopySlot(batchslot, slot); Thoughts? > Also, why do we want to do ExecClearTuple() anyway? Isn't > it good enough that the next call to ExecCopySlot will effectively clear > it out? For virtual, heap, minimal tuple slots, yes ExecCopySlot slot clears the slot before copying. But, for buffer heap slots, the tts_buffer_heap_copyslot does not always clear the destination slot, see below. If we fall into else condition, we might get some issues. And also note that, once the slot is cleared in ExecClearTuple, it will not be cleared again in ExecCopySlot because TTS_SHOULDFREE(slot) will be false. That is why, let's have ExecClearTuple as is. /* * If the source slot is of a different kind, or is a buffer slot that has * been materialized / is virtual, make a new copy of the tuple. Otherwise * make a new reference to the in-buffer tuple. */ if (dstslot->tts_ops != srcslot->tts_ops || TTS_SHOULDFREE(srcslot) || !bsrcslot->base.tuple) { MemoryContext oldContext; ExecClearTuple(dstslot); } else { Assert(BufferIsValid(bsrcslot->buffer)); tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple, bsrcslot->buffer, false); > - flushed -> why is this a stored boolean? isn't this indirectly encoded > by cur_slots/cur_size == 0? Note that cur_slots is in HeapMultiInsertState and outside of the new APIs i.e. in TableInsertState, mistate is a void pointer, and we can't really access the cur_slots. I mean, we can access but we need to be dereferencing using the tableam kind. Instead of doing all of that, to keep the API cleaner, I chose to have a boolean in the TableInsertState which we can see and use outside of the new APIs. Hope that's fine. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com --000000000000a1456f05b8245cf4 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Mon, Jan 4, 2021 at 1:29 PM Luc Vlaming <luc@swarm64.com> wrote:> =C2=A0> table AM patch [2] be reviewed further?
> As to the = patches themselves:
>
> I think the API is a huge step forward!= I assume that we want to have a
> single-insert API like heap_insert= _v2 so that we can encode the
> knowledge that there will just be a s= ingle insert coming and likely a
> commit afterwards?
>
>= Reason I'm asking is that I quite liked the heap_insert_begin paramete= r
> is_multi, which could even be turned into a "expected_rowcou= nt" of the
> amount of rows expected to be commited in the trans= action (e.g. single,
> several, thousands/stream).
> If we were= to make the API based on expected rowcounts, the whole
> heap_insert= _v2, heap_insert and heap_multi_insert could be turned into a
> singl= e function heap_insert, as the knowledge about buffering of the
> slo= ts is then already stored in the TableInsertState, creating an API like:>
> // expectedRows: -1 =3D streaming, otherwise expected rowcoun= t.
> TableInsertState* heap_insert_begin(Relation rel, CommandId cid,= int
> options, int expectedRows);
> heap_insert(TableInsertSta= te *state, TupleTableSlot *slot);
>
> Do you think that's a= good idea?

IIUC, your suggestion is to use expectedRows and move th= e multi insert implementation heap_multi_insert_v2 to heap_insert_v2. If th= at's correct, so heap_insert_v2 will look something like this:

h= eap_insert_v2()
{
=C2=A0 =C2=A0 if (single_insert)
=C2=A0 =C2=A0 = =C2=A0 //do single insertion work, the code in existing heap_insert_v2 come= s here
=C2=A0 =C2=A0else
=C2=A0 =C2=A0 =C2=A0 //do multi insertion wo= rk, the code in existing heap_multi_insert_v2 comes here
}

I don&= #39;t see any problem in combining single and multi insert APIs into one. H= aving said that, will the APIs be cleaner then? Isn't it going to be co= nfusing if a single heap_insert_v2 API does both the works? With the existi= ng separate APIs, for single insertion, the sequence of the API can be like= begin, insert_v2, end and for multi inserts it's like begin, multi_ins= ert_v2, flush, end. I prefer to have a separate multi insert API so that it= will make the code look readable.

Thoughts?

> Two smaller= things I'm wondering:
> - the clear_mi_slots; why is this not in= the HeapMultiInsertState? the
> slots themselves are declared there?=

Firstly, we need to have the buffered slots sometimes(please have a= look at the comments in TableInsertState structure) outside the multi_inse= rt API. And we need to have cleared the previously flushed slots before we = start buffering in heap_multi_insert_v2(). I can remove the clear_mi_slots = flag altogether and do as follows: I will not set mistate->cur_slots to = 0 in heap_multi_insert_flush after the flush, I will only set state->flu= shed to true. In heap_multi_insert_v2,

void
heap_multi_insert_v2(= TableInsertState *state, TupleTableSlot *slot)
{
=C2=A0 =C2=A0 TupleT= ableSlot =C2=A0*batchslot;
=C2=A0 =C2=A0 HeapMultiInsertState *mistate = =3D (HeapMultiInsertState *)state->mistate;
=C2=A0 =C2=A0 Size sz;
=C2=A0 =C2=A0 Assert(mistate && mistate->slots);

=C2= =A0 =C2=A0 /* if the slots are flushed previously then clear them off be= fore using them again. */
=C2=A0 =C2=A0 if (state->flushed)
=C2=A0= =C2=A0 {
=C2=A0 =C2=A0 =C2=A0 =C2=A0 int i;

=C2=A0 =C2=A0 =C2=A0= =C2=A0 for (i =3D 0; i < mistate->cur_slots; i++)
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 ExecClearTuple(mistate->slots[i]);

= =C2=A0 =C2=A0 =C2=A0 =C2=A0 mistate->cur_slots =3D 0;
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 state->flushed =3D false
=C2=A0 =C2=A0 }


=C2= =A0 =C2=A0 if (mistate->slots[mistate->cur_slots] =3D=3D NULL)
=C2= =A0 =C2=A0 =C2=A0 =C2=A0 mistate->slots[mistate->cur_slots] =3D
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 table_slot_create(stat= e->rel, NULL);

=C2=A0 =C2=A0 batchslot =3D mistate->slots[mist= ate->cur_slots];

=C2=A0 =C2=A0 ExecCopySlot(batchslot, slot);
=

Thoughts?

> Also,= why do we want to do ExecClearTuple() anyway? Isn't
> it good en= ough that the next call to ExecCopySlot will effectively clear
> it o= ut?

For virtual, heap, minimal tuple slots, yes ExecCopySlot slot cl= ears the slot before copying. But, for buffer heap slots, the tts_buffer_he= ap_copyslot does not always clear the destination slot, see below. If we fa= ll into else condition, we might get some issues. And also note that, once = the slot is cleared in ExecClearTuple, it will not be cleared again in Exec= CopySlot because TTS_SHOULDFREE(slot) will be false. That is why, let's= have ExecClearTuple as is.

=C2=A0 =C2=A0 /*
=C2=A0 =C2=A0 =C2=A0= * If the source slot is of a different kind, or is a buffer slot that has=C2=A0 =C2=A0 =C2=A0* been materialized / is virtual, make a new copy of = the tuple. Otherwise
=C2=A0 =C2=A0 =C2=A0* make a new reference to the i= n-buffer tuple.
=C2=A0 =C2=A0 =C2=A0*/
=C2=A0 =C2=A0 if (dstslot->= tts_ops !=3D srcslot->tts_ops ||
=C2=A0 =C2=A0 =C2=A0 =C2=A0 TTS_SHOU= LDFREE(srcslot) ||
=C2=A0 =C2=A0 =C2=A0 =C2=A0 !bsrcslot->base.tuple)=
=C2=A0 =C2=A0 {
=C2=A0 =C2=A0 =C2=A0 =C2=A0 MemoryContext oldContext= ;

=C2=A0 =C2=A0 =C2=A0 =C2=A0 ExecClearTuple(dstslot);
=C2=A0 =C2= =A0 }
=C2=A0 =C2=A0 else
=C2=A0 =C2=A0 {
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 Assert(BufferIsValid(bsrcslot->buffer));

=C2=A0 =C2=A0 =C2=A0= =C2=A0 tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 bsrcslot->buffer, f= alse);

> - flushed -> why is this a stored boolean? isn't = this indirectly encoded
> by cur_slots/cur_size =3D=3D 0?

Note= that cur_slots is in HeapMultiInsertState and outside of the new APIs i.e.= in TableInsertState, mistate is a void pointer, and we can't really ac= cess the cur_slots. I mean, we can access but we need to be dereferencing u= sing the tableam kind. Instead of =C2=A0doing all of that, to keep the API = cleaner, I chose to have a boolean in the TableInsertState which we can see= and use outside of the new APIs. Hope that's fine.

W= ith Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
--000000000000a1456f05b8245cf4--