Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rrjzw-002Q60-NP for pgsql-hackers@arkaria.postgresql.org; Tue, 02 Apr 2024 19:40:45 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rrjzu-00BK82-Tv for pgsql-hackers@arkaria.postgresql.org; Tue, 02 Apr 2024 19:40:42 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rrjzu-00BK7s-Ad for pgsql-hackers@lists.postgresql.org; Tue, 02 Apr 2024 19:40:42 +0000 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rrjzr-000End-HX for pgsql-hackers@postgresql.org; Tue, 02 Apr 2024 19:40:41 +0000 Received: by mail-pf1-x42f.google.com with SMTP id d2e1a72fcca58-6ea838bf357so4463193b3a.0 for ; Tue, 02 Apr 2024 12:40:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=j-davis-com.20230601.gappssmtp.com; s=20230601; t=1712086838; x=1712691638; darn=postgresql.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=7a1KzQVCz081VLEyslAw7B2Swh1R0KvU53/8xGu+3lo=; b=hu3GoWQ6i523Hrdida6HuoLRAVrLmtsEw3km5yhxpH6anZsB8bG6jZ4wgfsOcDBczE 8nKR5T5m252xetY/seynbkHTnVF19OyCTh7ukJg9LD+gmWoh6vmA2I8/k9MYbmchpMsK 75dV9SGlbSdffbzuG1VUkdTUCjgk7MoIc5hxPHn1RliA8QUrwLRmOD5WC6xTyPDJ9jBO WCxaY3JPz4Zx/qfWI3LnfM02dRuHsOMexWwkeMwRipmtskCUuKUKVUC2etbVHWeotW34 iQdOEZkZq/FrUfLAKaaVoqyQ8R9C97eCqX7J4/tt3tLm8Yn6DUlR6mIwszyO8hPcW+d9 VjaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712086838; x=1712691638; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=7a1KzQVCz081VLEyslAw7B2Swh1R0KvU53/8xGu+3lo=; b=tQJo/WWGar5Tv5XIJLcWxzCP0Flk9/U1tELmnAqvrvoh7QL23AKOcsyH+jJ2SnXvFK tExsGc0FK9ThppiBoKPf2HLrYjG4qQaSfmG2RmJ1DWQyUZTljPJEG/nR9bMQuQpJJBDZ sFcepAfW1zXhN8MKPkfYBIfvsywg+T9Sm5lvf/pgXwdGeRyGCZKpnPixZA83Pdmio3oH Kj1Vi914aEofx/aQ77+tdzOav2+LeMSfeZsg4dT5SW6NYKRSd557UIJ6F4ZAqAjlEAnh XLVPpJlOl6CoDpj4V46KuMYxnBZW6VFxjCbWTaut1nsK5CFU9RQoYKGGXSirEq1xHQiH XwWw== X-Forwarded-Encrypted: i=1; AJvYcCUhmt8/6qiVepyLh+pUi2+8xxCq4USs6EVw60w2hTGjwoS3X8Clnetx96XvA7O0G4tSQlH2+73WwjKG4qZkNWrNKKTYgKmu28B5RB9T X-Gm-Message-State: AOJu0YyVTRpvn8HsLTNE+4OGf2Suxc62UZ8gYsoIkv+Z0g+IwEfyTMO/ UQhIHG7PbsEtyGKRGskxTpdul3sUmDx1mulRwyNrOXC3ZDrdsxEJHU6WkLdU/w== X-Google-Smtp-Source: AGHT+IFcHOZf7Z5PXtzGwNSVd+6dKjeCPuAfzEXVnfWHZC52dfRm9P9XU+dXvy8Fc98OFM0yuWbOXw== X-Received: by 2002:a05:6a20:914a:b0:1a5:6b5b:c0c5 with SMTP id x10-20020a056a20914a00b001a56b5bc0c5mr13057213pzc.14.1712086838344; Tue, 02 Apr 2024 12:40:38 -0700 (PDT) Received: from [172.24.2.15] ([12.126.244.130]) by smtp.gmail.com with ESMTPSA id g18-20020aa78192000000b006e7324d32bbsm10209835pfi.122.2024.04.02.12.40.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Apr 2024 12:40:37 -0700 (PDT) Message-ID: <8633171cb034aafc260fdf37df04b6c779aa1e2f.camel@j-davis.com> Subject: Re: New Table Access Methods for Multi and Single Inserts From: Jeff Davis To: Bharath Rupireddy Cc: Masahiko Sawada , PostgreSQL-development , Andres Freund , Dilip Kumar , Luc Vlaming , Justin Pryzby , Michael Paquier , Matthias van de Meent , Alexander Korotkov Date: Tue, 02 Apr 2024 12:40:36 -0700 In-Reply-To: References: <20230603223824.o7iyochli2dwwi7k@alap3.anarazel.de> <6be6f58815dc0844fbe058edf56b4e735a6efc1c.camel@j-davis.com> <2280bf7241119bb88cbe0fe5eb36490cbd04c0c0.camel@j-davis.com> <12f1a2d8dd3b6305c0354f1c701f44b7be5e54eb.camel@j-davis.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.44.4-0ubuntu2 MIME-Version: 1.0 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Sun, 2024-03-31 at 21:18 +0530, Bharath Rupireddy wrote: > if (table_modify_buffer_insert() is defined) > =C2=A0=C2=A0 table_modify_buffer_insert(...); > else > { > =C2=A0 myState->bistate =3D GetBulkInsertState(); > =C2=A0 table_tuple_insert(...); > } We can't alloc/free the bulk insert state for every insert call. I see two options: * Each caller needs to support two code paths: if the buffered insert APIs are defined, then use those; otherwise the caller needs to manage the bulk insert state itself and call the plain insert API. * Have default implementation for the new API methods, so that the default for the begin method would allocate the bulk insert state, and the default for the buffered insert method would be to call plain insert using the bulk insert state. I'd prefer the latter, at least in the long term. But I haven't really thought through the details, so perhaps we'd need to use the former. > >=20 > > After we have these new APIs fully in place and used by COPY, what > > will > > happen to those other APIs? Will they be deprecated or will there > > be a > > reason to keep them? >=20 > Deprecated perhaps? Including Alexander on this thread, because he's making changes to the multi-insert API. We need some consensus on where we are going with these APIs before we make more changes, and what incremental steps make sense in v17. Here's where I think this API should go: 1. Have table_modify_begin/end and table_modify_buffer_insert, like those that are implemented in your patch. 2. Add some kind of flush callback that will be called either while the tuples are being flushed or after the tuples are flushed (but before they are freed by the AM). (Aside: do we need to call it while the tuples are being flushed to get the right visibility semantics for after-row triggers?) 3. Add table_modify_buffer_{update|delete} APIs. 4. Some kind of API tweaks to help manage memory when modifying pertitioned tables, so that the buffering doesn't get out of control. Perhaps just reporting memory usage and allowing the caller to force flushes would be enough. 5. Use these new methods for CREATE/REFRESH MATERIALIZED VIEW. This is fairly straightforward, I believe, and handled by your patch. Indexes are (re)built afterward, and no triggers are possible. 6. Use these new methods for CREATE TABLE ... AS. This is fairly straightforward, I believe, and handled by your patch. No indexes or triggers are possible. 7. Use these new methods for COPY. We have to be careful to avoid regressions for the heap method, because it's already managing its own buffers. If the AM manages the buffering, then it may require additional copying of slots, which could be a disadvantage. To solve this, we may need some minor API tweaks to avoid copying when the caller guarantees that the memory will not be freed to early, or perhaps expose the AM's memory context to copyfrom.c. Another thing to consider is that the buffering in copyfrom.c is also used for FDWs, so that buffering code path needs to be preserved in copyfrom.c even if not used for AMs. 8. Use these new methods for INSERT INTO ... SELECT. One potential challenge here is that execution nodes are not always run to completion, so we need to be sure that the flush isn't forgotten in that case. 9. Use these new methods for DELETE, UPDATE, and MERGE. MERGE can use the buffer_insert/update/delete APIs; we don't need a separate merge method. This probably requires that the AM maintain 3 separate buffers to distinguish different kinds of changes at flush time (obviously these can be initialized lazily to avoid overhead when not being used). 10. Use these new methods for logical apply. 11. Deprecate the multi_insert API. Thoughts on this plan? Does your patch make sense in v17 as a stepping stone, or should we try to make all of these API changes together in v18? Also, a sample AM code would be a huge benefit here. Writing a real AM is hard, but perhaps we can at least have an example one to demonstrate how to use these APIs? Regards, Jeff Davis