Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lDL5K-0000S8-D8 for pgsql-hackers@arkaria.postgresql.org; Sat, 20 Feb 2021 05:45:42 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1lDL5I-0001eA-Qs for pgsql-hackers@arkaria.postgresql.org; Sat, 20 Feb 2021 05:45:40 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lDL5I-0001e2-HY for pgsql-hackers@lists.postgresql.org; Sat, 20 Feb 2021 05:45:40 +0000 Received: from mail-pg1-x533.google.com ([2607:f8b0:4864:20::533]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1lDL5F-00005f-Nh for pgsql-hackers@postgresql.org; Sat, 20 Feb 2021 05:45:40 +0000 Received: by mail-pg1-x533.google.com with SMTP id p21so6691766pgl.12 for ; Fri, 19 Feb 2021 21:45:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=frdXGhvzfdlSmh2JxmM8k2Td8d55lznaJIYNtQWy2gc=; b=bKyOuycMWh4as/qbsH3J3RRtPPvaSWxFbH3eU/7P8n1qh11AxDGI0UwfQ0IaH1HGuS 3ZlZ5pt6gktJlBIuXxJZB/L5GenRzOvIO83XsYdyqdKWi3HFYycU2IW+PUzAb+OJbRiB yIeXh4dbkCg2yhNqXt8WJQDt/qPEMvFB05tpphJ6VFAAor0xUV+MtH/LU69MCJNoLQYf /DwayP3tkWOi4qNJupV7vHWHaHwKWlohF6N32uS6/x2dCJ37vwiqB9M+nY0mIutAsxgz BPsDB2iKMuXHONSclGSkUI6/WEeNlfBBJ9mKBmmfUQ4hRDwDtrhfr60JF+m7yKNHoQZG IXZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=frdXGhvzfdlSmh2JxmM8k2Td8d55lznaJIYNtQWy2gc=; b=gHQ2ZD5uud9CECaA9VHxVa+Ge23iFhksJWzTSdFlLURZo7dcGwLKV9+hB2QqNC7MDr TvdTaveRxWZPLIRLYtM79s53hIDGsdvubKvlsfC03pwDzAE0c1uZOghieb7QvJfGC4YO JmgZj+xzwWmatpo5xiJL8AxbqcK0K/kVdDRImYPXvqGvxpfOsPmiCEJLuMi17/cj/9Dg HRuiAwtjsPW6wjbZK95CgIjEckMZmiPC/rJ/4tPcWk4fRIksXDZG6xufQA0yjLiRw83i 57/EmGjhm6hESAogg2BKXbbXoUgpSXk+z6guv7ywiduIhHcmjjBvAOb/MQTbl6MPF20V 2fSA== X-Gm-Message-State: AOAM532BRiLNir9uEzTclL+28iCXtHBoZLgOrkxbyl7mEwn2kB+plQtD whkIkGrlzsakLz7Y2G6rl44UB3+GE4dHkIP/ZHg= X-Google-Smtp-Source: ABdhPJyhMon62TNWJxxeREizZboxnvvsFJU61jl8HwkKMYq6Vl8Hcf/vP9IZ/lGSvuEswPaReh1UNM6o3ZXTbfzxurk= X-Received: by 2002:a63:4e13:: with SMTP id c19mr11379392pgb.432.1613799935412; Fri, 19 Feb 2021 21:45:35 -0800 (PST) MIME-Version: 1.0 References: <20201217050522.GU30237@telsasoft.com> <20201217204442.GX30237@telsasoft.com> <20201218175439.GA30237@telsasoft.com> <20201221074725.GF30237@telsasoft.com> <20201225023958.GW30237@telsasoft.com> <96eaa813-4ad6-e80a-04a4-cc8082d356dd@swarm64.com> <508af801-6356-d36b-1867-011ac6df8f55@swarm64.com> In-Reply-To: From: Bharath Rupireddy Date: Sat, 20 Feb 2021 11:15:24 +0530 Message-ID: Subject: Re: New Table Access Methods for Multi and Single Inserts To: Luc Vlaming Cc: Justin Pryzby , PostgreSQL-development , Andres Freund , Paul Guo , Jeff Davis Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk On Wed, Feb 17, 2021 at 12:46 PM Bharath Rupireddy wrote: > Hi, > > I addressed the following review comments and attaching v3 patch set. > > 1) ExecClearTuple happens before ExecCopySlot in heap_multi_insert_v2 > and this allowed us to remove clear_mi_slots flag from > TableInsertState. > 2) I retained the flushed variable inside TableInsertState so that the > callers can know whether the buffered slots have been flushed. If yes, > the callers can execute after insert row triggers or perform index > insertions. This is easier than passing the after insert row triggers > info and index info to new multi insert table am and let it do. This > way the functionalities can be kept separate i.e. multi insert ams do > only buffering, decisions on when to flush, insertions and the callers > will execute triggers or index insertions. And also none of the > existing table ams are performing these operations within them, so > this is inline with the current design of the table ams. > 3) I have kept the single and multi insert API separate. The previous > suggestion was to have only a single insert API and let the callers > provide initially whether they want multi or single inserts. One > problem with that approach is that we have to allow table ams to > execute the after row triggers or index insertions. That is something > I personally don't like. > > 0001 - new table ams implementation > 0002 - the new multi table ams used in CREATE TABLE AS and REFRESH > MATERIALIZED VIEW > 0003 - the new multi table ams used in COPY > > Please review the v3 patch set further. Below is the performance gain measured for CREATE TABLE AS with the new multi insert am propsed in this thread: case 1 - 2 integer(of 4 bytes each) columns, 3 varchar(8), tuple size 59 bytes, 100mn tuples on master - 185sec on master with multi inserts - 121sec, gain - 1.52X case 2 - 2 bigint(of 8 bytes each) columns, 3 name(of 64 bytes each) columns, 1 varchar(8), tuple size 241 bytes, 100mn tuples on master - 367sec on master with multi inserts - 291sec, gain - 1.26X case 3 - 2 integer(of 4 bytes each) columns, tuple size 32 bytes, 100mn tuples on master - 130sec on master with multi inserts - 105sec, gain - 1.23X case 4 - 2 bigint(of 8 bytes each) columns, 16 name(of 64 bytes each) columns, tuple size 1064 bytes, 10mn tuples on master - 120sec on master with multi inserts - 115sec, gain - 1.04X With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com