Re: New Table Access Methods for Multi and Single Inserts

public inbox for [email protected]  
help / color / mirror / Atom feed

Re: New Table Access Methods for Multi and Single Inserts
4+ messages / 3 participants
[nested] [flat]

* Re: New Table Access Methods for Multi and Single Inserts
@ 2022-03-06 11:11 Bharath Rupireddy <[email protected]>
  2022-03-07 16:09 ` Re: New Table Access Methods for Multi and Single Inserts Matthias van de Meent <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Bharath Rupireddy @ 2022-03-06 11:11 UTC (permalink / raw)
  To: Matthias van de Meent <[email protected]>; +Cc: Dilip Kumar <[email protected]>; Luc Vlaming <[email protected]>; Justin Pryzby <[email protected]>; pgsql-hackers; Andres Freund <[email protected]>; Paul Guo <[email protected]>; Jeff Davis <[email protected]>

On Fri, Mar 4, 2022 at 8:07 PM Matthias van de Meent
<[email protected]> wrote:
> > Another rebase due to conflicts in 0003. Attaching v6 for review.
>
> I recently touched the topic of multi_insert, and I remembered this
> patch. I had to dig a bit to find it, but as it's still open I've
> added some comments:

Thanks for reviving the thread. I almost lost hope in it. In fact, it
took me a while to recollect the work and respond to your comments.
I'm now happy to answer or continue working on this patch if you or
someone is really interested to review it and take it to the end.

> > diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
> > +#define MAX_BUFFERED_TUPLES        1000
> > +#define MAX_BUFFERED_BYTES        65535
>
> It looks like these values were copied over from copyfrom.c, but are
> now expressed in the context of the heapam.
> As these values are now heap-specific (as opposed to the
> TableAM-independent COPY infrastructure), shouldn't we instead
> optimize for heap page insertions? That is, I suggest using a multiple
> (1 or more) of MaxHeapTuplesPerPage for _TUPLES, and that same / a
> similar multiple of BLCKSZ for MAX_BUFFERED_BYTES.

We can do that. In fact, it is a good idea to let callers pass in as
an input to tuple_insert_begin and have it as part of
TableInsertState. If okay, I will do that in the next patch.

> > TableInsertState->flushed
> > TableInsertState->mi_slots
>
> I don't quite like the current storage-and-feedback mechanism for
> flushed tuples. The current assumptions in this mechanism seem to be
> that
> 1.) access methods always want to flush all available tuples at once,
> 2.) access methods want to maintain the TupleTableSlots for all
> inserted tuples that have not yet had all triggers handled, and
> 3.) we need access to the not-yet-flushed TupleTableSlots.
>
> I think that that is not a correct set of assumptions; I think that
> only flushed tuples need to be visible to the tableam-agnostic code;
> and that tableams should be allowed to choose which tuples to flush at
> which point; as long as they're all flushed after a final
> multi_insert_flush.
>
> Examples:
> A heap-based access method might want bin-pack tuples and write out
> full pages only; and thus keep some tuples in the buffers as they
> didn't fill a page; thus having flushed only a subset of the current
> buffered tuples.
> A columnstore-based access method might not want to maintain the
> TupleTableSlots of original tuples, but instead use temporary columnar
> storage: TupleTableSlots are quite large when working with huge
> amounts of tuples; and keeping lots of tuple data in memory is
> expensive.
>
> As such, I think that this should be replaced with a
> TableInsertState->mi_flushed_slots + TableInsertState->mi_flushed_len,
> managed by the tableAM, in which only the flushed tuples are visible
> to the AM-agnostic code. This is not much different from how the
> implementation currently works; except that ->mi_slots now does not
> expose unflushed tuples; and that ->flushed is replaced by an integer
> value of number of flushed tuples.

Yeah, that makes sense. Let's table AMs expose the flushed tuples
outside on which the callers can handle the after-insert row triggers
upon them.

IIUC, TableInsertState needs to have few other variables:

    /* Below members are only used for multi inserts. */
    /* Array of buffered slots. */
    TupleTableSlot  **mi_slots;
    /* Number of slots that are currently buffered. */
    int32   mi_cur_slots;
    /* Array of flushed slots that will be used by callers to handle
after-insert row triggers or similar events outside */
    TupleTableSlot  **mi_flushed_slots ;
    /* Number of slots that are currently buffered. */
    int32   mi_no_of_flushed_slots;

The implementation of heap_multi_insert_flush will just set the
mi_slots to mi_flushed_slots.

> A further improvement (in my opinion) would be the change from a
> single multi_insert_flush() to a signalling-based multi_insert_flush:
> It is not unreasonable for e.g. a columnstore to buffer tens of
> thousands of inserts; but doing so in TupleTableSlots would introduce
> a high memory usage. Allowing for batched retrieval of flushed tuples
> would help in memory usage; which is why multiple calls to
> multi_insert_flush() could be useful. To handle this gracefully, we'd
> probably add TIS->mi_flush_remaining, where each insert adds one to
> mi_flush_remaining; and each time mi_flushed_slots has been handled
> mi_flush_remaining is decreased by mi_flushed_len by the handler code.
> Once we're done inserting into the table, we keep calling
> multi_insert_flush until no more tuples are being flushed (and error
> out if we're still waiting for flushes but no new flushed tuples are
> returned).

The current approach is signalling-based right?
heap_multi_insert_v2
    if (state->mi_cur_slots >= mistate->max_slots ||
        mistate->cur_size >= mistate->max_size)
        heap_multi_insert_flush(state);

The table_multi_insert_v2 am implementers will have to carefully
choose buffering strategy i.e. number of tuples, size to buffer and
decide rightly without hitting memory usages.

Thoughts?

Regards,
Bharath Rupireddy.






^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: New Table Access Methods for Multi and Single Inserts
  2022-03-06 11:11 Re: New Table Access Methods for Multi and Single Inserts Bharath Rupireddy <[email protected]>
@ 2022-03-07 16:09 ` Matthias van de Meent <[email protected]>
  2022-10-12 05:30   ` Re: New Table Access Methods for Multi and Single Inserts Michael Paquier <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Matthias van de Meent @ 2022-03-07 16:09 UTC (permalink / raw)
  To: Bharath Rupireddy <[email protected]>; +Cc: Dilip Kumar <[email protected]>; Luc Vlaming <[email protected]>; Justin Pryzby <[email protected]>; pgsql-hackers; Andres Freund <[email protected]>; Paul Guo <[email protected]>; Jeff Davis <[email protected]>

On Sun, 6 Mar 2022 at 12:12, Bharath Rupireddy
<[email protected]> wrote:
>
> On Fri, Mar 4, 2022 at 8:07 PM Matthias van de Meent
> <[email protected]> wrote:
> > > Another rebase due to conflicts in 0003. Attaching v6 for review.
> >
> > I recently touched the topic of multi_insert, and I remembered this
> > patch. I had to dig a bit to find it, but as it's still open I've
> > added some comments:
>
> Thanks for reviving the thread. I almost lost hope in it. In fact, it
> took me a while to recollect the work and respond to your comments.
> I'm now happy to answer or continue working on this patch if you or
> someone is really interested to review it and take it to the end.
>
> > > diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
> > > +#define MAX_BUFFERED_TUPLES        1000
> > > +#define MAX_BUFFERED_BYTES        65535
> >
> > It looks like these values were copied over from copyfrom.c, but are
> > now expressed in the context of the heapam.
> > As these values are now heap-specific (as opposed to the
> > TableAM-independent COPY infrastructure), shouldn't we instead
> > optimize for heap page insertions? That is, I suggest using a multiple
> > (1 or more) of MaxHeapTuplesPerPage for _TUPLES, and that same / a
> > similar multiple of BLCKSZ for MAX_BUFFERED_BYTES.
>
> We can do that. In fact, it is a good idea to let callers pass in as
> an input to tuple_insert_begin and have it as part of
> TableInsertState. If okay, I will do that in the next patch.
>
> > > TableInsertState->flushed
> > > TableInsertState->mi_slots
> >
> > I don't quite like the current storage-and-feedback mechanism for
> > flushed tuples. The current assumptions in this mechanism seem to be
> > that
> > 1.) access methods always want to flush all available tuples at once,
> > 2.) access methods want to maintain the TupleTableSlots for all
> > inserted tuples that have not yet had all triggers handled, and
> > 3.) we need access to the not-yet-flushed TupleTableSlots.
> >
> > I think that that is not a correct set of assumptions; I think that
> > only flushed tuples need to be visible to the tableam-agnostic code;
> > and that tableams should be allowed to choose which tuples to flush at
> > which point; as long as they're all flushed after a final
> > multi_insert_flush.
> >
> > Examples:
> > A heap-based access method might want bin-pack tuples and write out
> > full pages only; and thus keep some tuples in the buffers as they
> > didn't fill a page; thus having flushed only a subset of the current
> > buffered tuples.
> > A columnstore-based access method might not want to maintain the
> > TupleTableSlots of original tuples, but instead use temporary columnar
> > storage: TupleTableSlots are quite large when working with huge
> > amounts of tuples; and keeping lots of tuple data in memory is
> > expensive.
> >
> > As such, I think that this should be replaced with a
> > TableInsertState->mi_flushed_slots + TableInsertState->mi_flushed_len,
> > managed by the tableAM, in which only the flushed tuples are visible
> > to the AM-agnostic code. This is not much different from how the
> > implementation currently works; except that ->mi_slots now does not
> > expose unflushed tuples; and that ->flushed is replaced by an integer
> > value of number of flushed tuples.
>
> Yeah, that makes sense. Let's table AMs expose the flushed tuples
> outside on which the callers can handle the after-insert row triggers
> upon them.
>
> IIUC, TableInsertState needs to have few other variables:
>
>     /* Below members are only used for multi inserts. */
>     /* Array of buffered slots. */
>     TupleTableSlot  **mi_slots;

Not quite: there's no need for TupleTableSlot  **mi_slots in the
TableInsertState; as the buffer used by the tableAM to buffer
unflushed tuples shouldn't be publicly visible. I suspect that moving
that field to HeapMultiInsertState instead would be the prudent thing
to do; limiting the external access of AM-specific buffers.

>     /* Number of slots that are currently buffered. */
>     int32   mi_cur_slots;
>     /* Array of flushed slots that will be used by callers to handle
> after-insert row triggers or similar events outside */
>     TupleTableSlot  **mi_flushed_slots ;
>     /* Number of slots that are currently buffered. */
>     int32   mi_no_of_flushed_slots;
>
> The implementation of heap_multi_insert_flush will just set the
> mi_slots to mi_flushed_slots.

Yes.

> > A further improvement (in my opinion) would be the change from a
> > single multi_insert_flush() to a signalling-based multi_insert_flush:
> > It is not unreasonable for e.g. a columnstore to buffer tens of
> > thousands of inserts; but doing so in TupleTableSlots would introduce
> > a high memory usage. Allowing for batched retrieval of flushed tuples
> > would help in memory usage; which is why multiple calls to
> > multi_insert_flush() could be useful. To handle this gracefully, we'd
> > probably add TIS->mi_flush_remaining, where each insert adds one to
> > mi_flush_remaining; and each time mi_flushed_slots has been handled
> > mi_flush_remaining is decreased by mi_flushed_len by the handler code.
> > Once we're done inserting into the table, we keep calling
> > multi_insert_flush until no more tuples are being flushed (and error
> > out if we're still waiting for flushes but no new flushed tuples are
> > returned).
>
> The current approach is signalling-based right?
> heap_multi_insert_v2
>     if (state->mi_cur_slots >= mistate->max_slots ||
>         mistate->cur_size >= mistate->max_size)
>         heap_multi_insert_flush(state);

That's for the AM-internal flushing; yes. I was thinking about the AM
api for flushing that's used when finalizing the batched insert; i.e.
table_multi_insert_flush.

Currently it assumes that all buffered tuples will be flushed after
one call (which is correct for heap), but putting those unflushed
tuples all at once back in memory might not be desirable or possible
(for e.g. columnar); so we might need to call table_multi_insert_flush
until there's no more buffered tuples.

> The table_multi_insert_v2 am implementers will have to carefully
> choose buffering strategy i.e. number of tuples, size to buffer and
> decide rightly without hitting memory usages.

Agreed

-Matthias






^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: New Table Access Methods for Multi and Single Inserts
  2022-03-06 11:11 Re: New Table Access Methods for Multi and Single Inserts Bharath Rupireddy <[email protected]>
  2022-03-07 16:09 ` Re: New Table Access Methods for Multi and Single Inserts Matthias van de Meent <[email protected]>
@ 2022-10-12 05:30   ` Michael Paquier <[email protected]>
  2022-10-12 05:35     ` Re: New Table Access Methods for Multi and Single Inserts Bharath Rupireddy <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Michael Paquier @ 2022-10-12 05:30 UTC (permalink / raw)
  To: Matthias van de Meent <[email protected]>; +Cc: Bharath Rupireddy <[email protected]>; Dilip Kumar <[email protected]>; Luc Vlaming <[email protected]>; Justin Pryzby <[email protected]>; pgsql-hackers; Andres Freund <[email protected]>; Paul Guo <[email protected]>; Jeff Davis <[email protected]>

On Mon, Mar 07, 2022 at 05:09:23PM +0100, Matthias van de Meent wrote:
> That's for the AM-internal flushing; yes. I was thinking about the AM
> api for flushing that's used when finalizing the batched insert; i.e.
> table_multi_insert_flush.
> 
> Currently it assumes that all buffered tuples will be flushed after
> one call (which is correct for heap), but putting those unflushed
> tuples all at once back in memory might not be desirable or possible
> (for e.g. columnar); so we might need to call table_multi_insert_flush
> until there's no more buffered tuples.

This thread has been idle for 6 months now, so I have marked it as
returned with feedback as of what looks like a lack of activity.  I
have looked at what's been proposed, and I am not really sure if the
direction taken is correct, though there may be a potential gain in
consolidating the multi-insert path within the table AM set of
callbacks.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: New Table Access Methods for Multi and Single Inserts
  2022-03-06 11:11 Re: New Table Access Methods for Multi and Single Inserts Bharath Rupireddy <[email protected]>
  2022-03-07 16:09 ` Re: New Table Access Methods for Multi and Single Inserts Matthias van de Meent <[email protected]>
  2022-10-12 05:30   ` Re: New Table Access Methods for Multi and Single Inserts Michael Paquier <[email protected]>
@ 2022-10-12 05:35     ` Bharath Rupireddy <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Bharath Rupireddy @ 2022-10-12 05:35 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Dilip Kumar <[email protected]>; Luc Vlaming <[email protected]>; Justin Pryzby <[email protected]>; pgsql-hackers; Andres Freund <[email protected]>; Paul Guo <[email protected]>; Jeff Davis <[email protected]>

On Wed, Oct 12, 2022 at 11:01 AM Michael Paquier <[email protected]> wrote:
>
> This thread has been idle for 6 months now, so I have marked it as
> returned with feedback as of what looks like a lack of activity.  I
> have looked at what's been proposed, and I am not really sure if the
> direction taken is correct, though there may be a potential gain in
> consolidating the multi-insert path within the table AM set of
> callbacks.

Thanks. Unfortunately, I'm not finding enough cycles to work on this
feature. I'm happy to help if others have any further thoughts and
take it from here.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 4+ messages in thread

end of thread, other threads:[~2022-10-12 05:35 UTC | newest]

Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2022-03-06 11:11 Re: New Table Access Methods for Multi and Single Inserts Bharath Rupireddy <[email protected]>
2022-03-07 16:09 ` Matthias van de Meent <[email protected]>
2022-10-12 05:30   ` Michael Paquier <[email protected]>
2022-10-12 05:35     ` Bharath Rupireddy <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox