MIME-Version: 1.0
References: 
 <CALj2ACVi9eTRYR=gdca5wxtj3Kk_9q9qVccxsS1hngTGOCjPwQ@mail.gmail.com>
 <20201217050522.GU30237@telsasoft.com>
 <CALj2ACVgT1iocd5nQ+rEmqt3xcCONkR037qbc8PiojdR39Ag=w@mail.gmail.com>
 <20201217204442.GX30237@telsasoft.com>
 <CALj2ACW3BC5kgdffZ2LD_CT2wQoXVc29kGB74SVWnGZ=UFqcAQ@mail.gmail.com>
 <20201218175439.GA30237@telsasoft.com> <20201221074725.GF30237@telsasoft.com>
 <CALj2ACWMnZZCu=G0PJkEeYYicKeuJ-X=SU19i6vQ1+=uXz8u0Q@mail.gmail.com>
 <20201225023958.GW30237@telsasoft.com>
 <CALj2ACVDtYYRYD2SC+X2ALOUkhnUcgC7RLxiEYVWW2HxxrfRww@mail.gmail.com>
 <96eaa813-4ad6-e80a-04a4-cc8082d356dd@swarm64.com>
In-Reply-To: <96eaa813-4ad6-e80a-04a4-cc8082d356dd@swarm64.com>
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 5 Jan 2021 15:36:24 +0530
Message-ID: 
 <CALj2ACVsiAZMsP8p5MPg6SSEtoMFFaiAa6j2AFtEQJDhfbgs3Q@mail.gmail.com>
Subject: Re: New Table Access Methods for Multi and Single Inserts
To: Luc Vlaming <luc@swarm64.com>
Cc: Justin Pryzby <pryzby@telsasoft.com>,
	PostgreSQL-development <pgsql-hackers@postgresql.org>,
 Andres Freund <andres@anarazel.de>,
	Paul Guo <guopa@vmware.com>, Jeff Davis <pgsql@j-davis.com>
Content-Type: multipart/alternative; boundary="000000000000a1456f05b8245cf4"
Precedence: bulk

--000000000000a1456f05b8245cf4
Content-Type: text/plain; charset="UTF-8"

On Mon, Jan 4, 2021 at 1:29 PM Luc Vlaming <luc@swarm64.com> wrote:
>  > table AM patch [2] be reviewed further?
> As to the patches themselves:
>
> I think the API is a huge step forward! I assume that we want to have a
> single-insert API like heap_insert_v2 so that we can encode the
> knowledge that there will just be a single insert coming and likely a
> commit afterwards?
>
> Reason I'm asking is that I quite liked the heap_insert_begin parameter
> is_multi, which could even be turned into an "expected_rowcount" of the
> number of rows expected to be committed in the transaction (e.g. single,
> several, thousands/stream).
> If we were to make the API based on expected rowcounts, the whole
> heap_insert_v2, heap_insert and heap_multi_insert could be turned into a
> single function heap_insert, as the knowledge about buffering of the
> slots is then already stored in the TableInsertState, creating an API like:
>
> // expectedRows: -1 = streaming, otherwise expected rowcount.
> TableInsertState* heap_insert_begin(Relation rel, CommandId cid, int
> options, int expectedRows);
> heap_insert(TableInsertState *state, TupleTableSlot *slot);
>
> Do you think that's a good idea?

IIUC, your suggestion is to take an expectedRows parameter and move the
multi insert implementation from heap_multi_insert_v2 into heap_insert_v2.
If that's correct, heap_insert_v2 would look something like this:

heap_insert_v2()
{
    if (single_insert)
        // single insertion work: the existing heap_insert_v2 code goes here
    else
        // multi insertion work: the existing heap_multi_insert_v2 code goes here
}

I don't see any problem with combining the single and multi insert APIs
into one. Having said that, will the API be cleaner then? Isn't it going to
be confusing if a single heap_insert_v2 does both jobs? With the existing
separate APIs, the call sequence for a single insert is begin, insert_v2,
end, and for multi inserts it is begin, multi_insert_v2, flush, end. I
prefer to keep a separate multi insert API so that the code stays readable.
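
To make the two call sequences concrete, here is a rough, self-contained sketch. It is not the patch code: every name in it (SketchInsertState, sketch_insert, sketch_multi_insert and so on) is a hypothetical stand-in, rows are plain ints, and "writing" a row only bumps a counter. It only illustrates that the single path goes begin, insert, end, while the multi path goes begin, multi_insert, flush, end.

```c
#include <assert.h>

/* Hypothetical stand-ins only; none of these are real PostgreSQL types. */
#define SKETCH_MAX_BUFFERED 4

typedef struct SketchInsertState
{
    int buffered[SKETCH_MAX_BUFFERED];  /* rows waiting for a flush */
    int nbuffered;                      /* number of rows in buffered[] */
    int nwritten;                       /* rows that reached the "table" */
    int nflushes;                       /* number of batch writes so far */
} SketchInsertState;

static void
sketch_insert_begin(SketchInsertState *state)
{
    state->nbuffered = 0;
    state->nwritten = 0;
    state->nflushes = 0;
}

/* Single insert: the row is written right away, nothing is buffered. */
static void
sketch_insert(SketchInsertState *state, int row)
{
    (void) row;                 /* the toy "table" only counts rows */
    state->nwritten++;
}

/* Write out whatever is buffered as one batch; no-op if nothing is buffered. */
static void
sketch_multi_insert_flush(SketchInsertState *state)
{
    if (state->nbuffered == 0)
        return;
    state->nwritten += state->nbuffered;
    state->nbuffered = 0;
    state->nflushes++;
}

/* Multi insert: buffer the row and flush once the buffer is full. */
static void
sketch_multi_insert(SketchInsertState *state, int row)
{
    state->buffered[state->nbuffered++] = row;
    if (state->nbuffered == SKETCH_MAX_BUFFERED)
        sketch_multi_insert_flush(state);
}

static void
sketch_insert_end(SketchInsertState *state)
{
    /* The caller must have flushed before ending. */
    assert(state->nbuffered == 0);
}

/* Single-row sequence: begin, insert, end. Returns rows written. */
static int
sketch_run_single(void)
{
    SketchInsertState state;

    sketch_insert_begin(&state);
    sketch_insert(&state, 42);
    sketch_insert_end(&state);
    return state.nwritten;
}

/* Multi-row sequence: begin, multi_insert..., flush, end. Returns flush count. */
static int
sketch_run_multi(int nrows, int *nwritten)
{
    SketchInsertState state;
    int         i;

    sketch_insert_begin(&state);
    for (i = 0; i < nrows; i++)
        sketch_multi_insert(&state, i);
    sketch_multi_insert_flush(&state);  /* pick up the partial last batch */
    sketch_insert_end(&state);
    *nwritten = state.nwritten;
    return state.nflushes;
}
```

In this toy model, sketch_run_multi(10, &n) performs three batch writes (4 + 4 + 2 rows), while sketch_run_single() never touches the buffer at all, which is the separation I would like to keep visible in the API.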

Thoughts?

> Two smaller things I'm wondering:
> - the clear_mi_slots; why is this not in the HeapMultiInsertState? the
> slots themselves are declared there?

Firstly, we sometimes need the buffered slots outside the multi_insert API
(please have a look at the comments in the TableInsertState structure). We
also need to clear the previously flushed slots before we start buffering
again in heap_multi_insert_v2(). I can remove the clear_mi_slots flag
altogether and do the following instead: I will not set mistate->cur_slots
to 0 in heap_multi_insert_flush after the flush, I will only set
state->flushed to true. Then, in heap_multi_insert_v2:

void
heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
{
    TupleTableSlot  *batchslot;
    HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate;
    Size sz;

    Assert(mistate && mistate->slots);


    /* If the slots were flushed previously, clear them before reuse. */
    if (state->flushed)
    {
        int i;

        for (i = 0; i < mistate->cur_slots; i++)
            ExecClearTuple(mistate->slots[i]);

        mistate->cur_slots = 0;
        state->flushed = false;
    }

    if (mistate->slots[mistate->cur_slots] == NULL)
        mistate->slots[mistate->cur_slots] =
                                    table_slot_create(state->rel, NULL);

    batchslot = mistate->slots[mistate->cur_slots];

    ExecCopySlot(batchslot, slot);
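
The flag-based clearing can also be modelled outside the server. The sketch below is only a toy model under my own assumptions: ToySlot, ToyMultiInsertState, ToyInsertState and the toy_* functions are hypothetical, a slot is just a bool plus an int, and mistate is a typed pointer here for brevity. It shows the intended behaviour: the flush leaves cur_slots alone so the flushed slots stay readable, and the next buffered row clears them first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are not the PostgreSQL slot or insert-state types. */
#define TOY_MAX_SLOTS 3

typedef struct ToySlot
{
    bool        has_tuple;      /* false means "cleared" */
    int         value;
} ToySlot;

typedef struct ToyMultiInsertState
{
    ToySlot     slots[TOY_MAX_SLOTS];
    int         cur_slots;      /* number of slots currently in use */
} ToyMultiInsertState;

typedef struct ToyInsertState
{
    ToyMultiInsertState *mistate;
    bool        flushed;        /* set by flush, reset by the next insert */
    int         nwritten;       /* rows that reached the "table" */
} ToyInsertState;

static void
toy_init(ToyInsertState *state, ToyMultiInsertState *mistate)
{
    int         i;

    for (i = 0; i < TOY_MAX_SLOTS; i++)
    {
        mistate->slots[i].has_tuple = false;
        mistate->slots[i].value = 0;
    }
    mistate->cur_slots = 0;
    state->mistate = mistate;
    state->flushed = false;
    state->nwritten = 0;
}

/* Flush: write the rows, but leave the slots and cur_slots untouched. */
static void
toy_flush(ToyInsertState *state)
{
    if (state->flushed)
        return;                 /* already written, do not count twice */
    state->nwritten += state->mistate->cur_slots;
    state->flushed = true;
}

/* Buffer one row; lazily clear slots left over from a previous flush. */
static void
toy_multi_insert(ToyInsertState *state, int value)
{
    ToyMultiInsertState *mistate = state->mistate;

    if (state->flushed)
    {
        int         i;

        for (i = 0; i < mistate->cur_slots; i++)
            mistate->slots[i].has_tuple = false;
        mistate->cur_slots = 0;
        state->flushed = false;
    }

    assert(mistate->cur_slots < TOY_MAX_SLOTS);
    mistate->slots[mistate->cur_slots].has_tuple = true;
    mistate->slots[mistate->cur_slots].value = value;
    mistate->cur_slots++;
}
```

After two toy_multi_insert() calls and a toy_flush(), the two slots are still populated and cur_slots is still 2, so a caller can keep using them. Only the next toy_multi_insert() clears them and restarts from slot 0.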

Thoughts?

> Also, why do we want to do ExecClearTuple() anyway? Isn't
> it good enough that the next call to ExecCopySlot will effectively clear
> it out?

For virtual, heap and minimal tuple slots, yes, ExecCopySlot clears the
destination slot before copying. But for buffer heap slots,
tts_buffer_heap_copyslot does not always clear the destination slot, see
below. If we fall into the else branch, we might run into issues. Also note
that once the slot is cleared by ExecClearTuple, it will not be cleared
again in ExecCopySlot because TTS_SHOULDFREE(slot) will be false. That is
why I'd like to keep ExecClearTuple as is.

    /*
     * If the source slot is of a different kind, or is a buffer slot that has
     * been materialized / is virtual, make a new copy of the tuple. Otherwise
     * make a new reference to the in-buffer tuple.
     */
    if (dstslot->tts_ops != srcslot->tts_ops ||
        TTS_SHOULDFREE(srcslot) ||
        !bsrcslot->base.tuple)
    {
        MemoryContext oldContext;

        ExecClearTuple(dstslot);
    }
    else
    {
        Assert(BufferIsValid(bsrcslot->buffer));

        tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
                                    bsrcslot->buffer, false);

> - flushed -> why is this a stored boolean? isn't this indirectly encoded
> by cur_slots/cur_size == 0?

Note that cur_slots lives in HeapMultiInsertState. Outside the new APIs we
only have TableInsertState, where mistate is a void pointer, so we can't
really access cur_slots. I mean, we could, but we would have to cast
mistate according to the table AM kind before dereferencing it. Instead of
doing all of that, and to keep the API cleaner, I chose to have a boolean in
TableInsertState that can be seen and used outside of the new APIs. Hope
that's fine.
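
As a rough illustration of that layering (again with hypothetical names, not the structures from the patch): the AM-private struct is only known to the AM's own functions, the generic state carries it as a void pointer, and an AM-independent caller looks only at the boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* AM-private state; generic callers are not supposed to know this layout. */
typedef struct DemoHeapMultiState
{
    int         cur_slots;
} DemoHeapMultiState;

/* AM-independent state; mistate is opaque at this level. */
typedef struct DemoInsertState
{
    void       *mistate;        /* really a DemoHeapMultiState for "heap" */
    bool        flushed;        /* visible to every caller */
} DemoInsertState;

/* Inside the AM: casting is fine because the AM owns the layout. */
static void
demo_heap_buffer(DemoInsertState *state)
{
    DemoHeapMultiState *mi = (DemoHeapMultiState *) state->mistate;

    mi->cur_slots++;
    state->flushed = false;
}

static void
demo_heap_flush(DemoInsertState *state)
{
    state->flushed = true;
}

/* Generic caller: reads only the flag and never casts mistate. */
static bool
demo_caller_saw_flush(const DemoInsertState *state)
{
    return state->flushed;
}
```

The point of the sketch is only that demo_caller_saw_flush() compiles and works without any knowledge of DemoHeapMultiState, whereas checking cur_slots == 0 from the same place would need the AM-specific cast.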

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

--000000000000a1456f05b8245cf4--