Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lDP7A-0000JM-9m for pgsql-hackers@arkaria.postgresql.org; Sat, 20 Feb 2021 10:03:52 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1lDP78-0000tT-48 for pgsql-hackers@arkaria.postgresql.org; Sat, 20 Feb 2021 10:03:50 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lDMcF-0003KZ-IB for pgsql-hackers@lists.postgresql.org; Sat, 20 Feb 2021 07:23:47 +0000 Received: from mail-lj1-x230.google.com ([2a00:1450:4864:20::230]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1lDMcC-00026J-EC for pgsql-hackers@postgresql.org; Sat, 20 Feb 2021 07:23:46 +0000 Received: by mail-lj1-x230.google.com with SMTP id a22so36965298ljp.10 for ; Fri, 19 Feb 2021 23:23:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yugabyte.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=MYhQAeyx6xloAUMxbxSBTXtN+blC9mS5tAWFI7+UoDg=; b=KrVymKN4KU1QNZW/j6aYUKb/iIxQRimgCt1h81HbRJ2pxiTgfJtsvjychABBR91e0T IBEGWneoXKmREKCmUo8i970oGuLQZVcbHEJf3fdauwNKICIEcp4CgnBJAYES0HihC0xs X9IRV4iEjwwgnXfkh5jC1K5khSxbvGwhq0acM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=MYhQAeyx6xloAUMxbxSBTXtN+blC9mS5tAWFI7+UoDg=; b=WltMW+Uc16uWQuVQdTA+DkSSCQFZ6VHDdOuIGd1Lve0XisO7FUmKXqBroyDV9zxwVD dcf7cz3ooPYtYlvk5CNDUlIVrPscXoaBOBPu25F5x+stZQzvA0gbIzUUI0suLsarmiYx aaoJmA3wBTZ163Yx5+sbkqZevhwu+0CZlGxIAa5Qhq5cgLFiITaJ5lqexcAy2gmSqqb1 CezPaGPD52FEJrxZCoeOqsIhvoEFzkv0ICrvzX3YEe8t5clFH3omrGnqIN4VThmQCtWr r/OqohylfssUe1McIEJhqE2UmFDjFwNOJPG1YT4XibJqMajk/t9ZyNaw9AE6jcWJ3Klz 8oQA== X-Gm-Message-State: AOAM533iluvn5SDFzMC/rIFtK8GfEasH9nNF9oVYCGRdYsBK+qUUareT GZr1SsnX8WywH4kJuy3Jd9fEdrdMbKGHidILl9nC9g== X-Google-Smtp-Source: ABdhPJxcLgULp7rBU8tdcc4Shu2lN+k3XqVBq3pmzNBQykLj35U668COaiZ2l12bX6ss0tNLAM1aZ0mchMxaC8pYbIw= X-Received: by 2002:a2e:145e:: with SMTP id 30mr8420771lju.199.1613805822644; Fri, 19 Feb 2021 23:23:42 -0800 (PST) MIME-Version: 1.0 References: <20201217050522.GU30237@telsasoft.com> <20201217204442.GX30237@telsasoft.com> <20201218175439.GA30237@telsasoft.com> <20201221074725.GF30237@telsasoft.com> <20201225023958.GW30237@telsasoft.com> <96eaa813-4ad6-e80a-04a4-cc8082d356dd@swarm64.com> <508af801-6356-d36b-1867-011ac6df8f55@swarm64.com> In-Reply-To: From: Zhihong Yu Date: Fri, 19 Feb 2021 23:25:51 -0800 Message-ID: Subject: Re: New Table Access Methods for Multi and Single Inserts To: Bharath Rupireddy Cc: Luc Vlaming , Justin Pryzby , PostgreSQL-development , Andres Freund , Paul Guo , Jeff Davis Content-Type: multipart/alternative; boundary="000000000000cba50805bbbf72da" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk --000000000000cba50805bbbf72da Content-Type: text/plain; charset="UTF-8" Hi, bq. case 3 - 2 integer(of 4 bytes each) columns, tuple size 32 bytes Is there some other column(s) per row apart from the integer columns ? Since the 2 integer columns only occupy 8 bytes. I wonder where the other 32-8=24 bytes come from. Thanks On Fri, Feb 19, 2021 at 9:45 PM Bharath Rupireddy < bharath.rupireddyforpostgres@gmail.com> wrote: > On Wed, Feb 17, 2021 at 12:46 PM Bharath Rupireddy > wrote: > > Hi, > > > > I addressed the following review comments and attaching v3 patch set. > > > > 1) ExecClearTuple happens before ExecCopySlot in heap_multi_insert_v2 > > and this allowed us to remove clear_mi_slots flag from > > TableInsertState. > > 2) I retained the flushed variable inside TableInsertState so that the > > callers can know whether the buffered slots have been flushed. If yes, > > the callers can execute after insert row triggers or perform index > > insertions. This is easier than passing the after insert row triggers > > info and index info to new multi insert table am and let it do. This > > way the functionalities can be kept separate i.e. multi insert ams do > > only buffering, decisions on when to flush, insertions and the callers > > will execute triggers or index insertions. And also none of the > > existing table ams are performing these operations within them, so > > this is inline with the current design of the table ams. > > 3) I have kept the single and multi insert API separate. The previous > > suggestion was to have only a single insert API and let the callers > > provide initially whether they want multi or single inserts. One > > problem with that approach is that we have to allow table ams to > > execute the after row triggers or index insertions. That is something > > I personally don't like. > > > > 0001 - new table ams implementation > > 0002 - the new multi table ams used in CREATE TABLE AS and REFRESH > > MATERIALIZED VIEW > > 0003 - the new multi table ams used in COPY > > > > Please review the v3 patch set further. > > Below is the performance gain measured for CREATE TABLE AS with the > new multi insert am propsed in this thread: > > case 1 - 2 integer(of 4 bytes each) columns, 3 varchar(8), tuple size > 59 bytes, 100mn tuples > on master - 185sec > on master with multi inserts - 121sec, gain - 1.52X > > case 2 - 2 bigint(of 8 bytes each) columns, 3 name(of 64 bytes each) > columns, 1 varchar(8), tuple size 241 bytes, 100mn tuples > on master - 367sec > on master with multi inserts - 291sec, gain - 1.26X > > case 3 - 2 integer(of 4 bytes each) columns, tuple size 32 bytes, 100mn > tuples > on master - 130sec > on master with multi inserts - 105sec, gain - 1.23X > > case 4 - 2 bigint(of 8 bytes each) columns, 16 name(of 64 bytes each) > columns, tuple size 1064 bytes, 10mn tuples > on master - 120sec > on master with multi inserts - 115sec, gain - 1.04X > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com > > > --000000000000cba50805bbbf72da Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,
bq. case 3 - 2 integer(of 4 bytes each) columns, t= uple size 32 bytes

Is there some other column(s) p= er row apart from the integer columns ? Since the 2 integer columns only oc= cupy 8 bytes. I wonder where the other 32-8=3D24 bytes come from.

Thanks

On Fri, Feb 19, 2021 at 9:45 PM Bharath Rupir= eddy <bharath.= rupireddyforpostgres@gmail.com> wrote:
On Wed, Feb 17, 2021 at 12:46 PM Bharath Rupi= reddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> Hi,
>
> I addressed the following review comments and attaching v3 patch set.<= br> >
> 1) ExecClearTuple happens before ExecCopySlot in heap_multi_insert_v2<= br> > and this allowed us to remove clear_mi_slots flag from
> TableInsertState.
> 2) I retained the flushed variable inside TableInsertState so that the=
> callers can know whether the buffered slots have been flushed. If yes,=
> the callers can execute after insert row triggers or perform index
> insertions. This is easier than passing the after insert row triggers<= br> > info and index info to new multi insert table am and let it do. This > way the functionalities can be kept separate i.e. multi insert ams do<= br> > only buffering, decisions on when to flush, insertions and the callers=
> will execute triggers or index insertions. And also none of the
> existing table ams are performing these operations within them, so
> this is inline with the current design of the table ams.
> 3) I have kept the single and multi insert API separate. The previous<= br> > suggestion was to have only a single insert API and let the callers > provide initially whether they want multi or single inserts. One
> problem with that approach is that we have to allow table ams to
> execute the after row triggers or index insertions. That is something<= br> > I personally don't like.
>
> 0001 - new table ams implementation
> 0002 - the new multi table ams used in CREATE TABLE AS and REFRESH
> MATERIALIZED VIEW
> 0003 - the new multi table ams used in COPY
>
> Please review the v3 patch set further.

Below is the performance gain measured for CREATE TABLE AS with the
new multi insert am propsed in this thread:

case 1 - 2 integer(of 4 bytes each) columns, 3 varchar(8), tuple size
59 bytes, 100mn tuples
on master - 185sec
on master with multi inserts - 121sec, gain - 1.52X

case 2 - 2 bigint(of 8 bytes each) columns, 3 name(of 64 bytes each)
columns, 1 varchar(8), tuple size 241 bytes, 100mn tuples
on master - 367sec
on master with multi inserts - 291sec, gain - 1.26X

case 3 - 2 integer(of 4 bytes each) columns, tuple size 32 bytes, 100mn tup= les
on master - 130sec
on master with multi inserts - 105sec, gain - 1.23X

case 4 - 2 bigint(of 8 bytes each) columns, 16 name(of 64 bytes each)
columns, tuple size 1064 bytes, 10mn tuples
on master - 120sec
on master with multi inserts - 115sec, gain - 1.04X

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


--000000000000cba50805bbbf72da--