From: Matthias van de Meent
Date: Mon, 26 Aug 2024 23:59:28 +0200
Subject: Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
To: Jeff Davis, Bharath Rupireddy
Cc: Masahiko Sawada, PostgreSQL-development, Andres Freund, Dilip Kumar, Luc Vlaming, Justin Pryzby, Michael Paquier, Alexander Korotkov

On Mon, 26 Aug 2024 at 23:18, Jeff Davis wrote:
>
> On Mon, 2024-08-26 at 11:09 +0530, Bharath Rupireddy wrote:
> > On Wed, Jun 5, 2024 at 12:42 PM Bharath Rupireddy wrote:
> > >
> > > Please find the v22 patches with the above changes.
> >
> > Please find the v23 patches after rebasing 0005 and adapting 0004 for
> > 9758174e2e.
>
>
> Thank you.
>
> 0001 API design:
>
> * Remove TableModifyState.modify_end_callback.
>
> * This patch means that we will either remove or deprecate
> TableAmRoutine.multi_insert and finish_bulk_insert. Are there any
> strong opinions about maintaining support for multi-insert, or should
> we just remove it outright and force any new AMs to implement the new
> APIs to maintain COPY performance?

I don't think there is a significant enough difference in the
capabilities and requirements between the two APIs as currently
designed that removal of the old API would mean a significant
difference in capabilities. Maybe we could supply an equivalent API
shim to help the transition, but I don't think we should keep the old
API around in the TableAM.

> * Why do we need a separate "modify_flags" and "options"? Can't we just
> combine them into TABLE_MODIFY_* flags?
>
>
> Alexander, you had some work in this area as well, such b1484a3f19. I
> believe 0001 covers this use case in a different way: rather than
> giving complete responsibility to the AM to insert into the indexes,
> the caller provides a callback and the AM is responsible for calling it
> at the time the tuples are flushed. Is that right?
>
> The design has been out for a while, so unless others have suggestions,
> I'm considering the major design points mostly settled and I will move
> forward with something like 0001 (pending implementation issues).
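
For illustration only, the callback-at-flush design described in the
quoted text can be modelled with a small toy. This is a sketch in
Python, not the patch's C API: `BufferedInserter`, `on_flush`,
`flush_threshold` and the integer `tid` are all hypothetical names
invented here, and nothing below is taken from the 0001 patch.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]


class BufferedInserter:
    """Toy stand-in for a table AM that buffers rows and flushes in batches.

    Hypothetical model only; it does not mirror any PostgreSQL structure.
    """

    def __init__(self, flush_threshold: int,
                 on_flush: Optional[Callable[[List[Row]], None]] = None):
        self.flush_threshold = flush_threshold
        self.on_flush = on_flush          # caller-supplied callback (assumed shape)
        self.buffer: List[Row] = []       # rows accepted but not yet stored
        self.table: List[Row] = []        # stands in for the table's storage

    def insert(self, row: Row) -> None:
        # Buffer the row; flush once the batch is full.
        self.buffer.append(row)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        flushed: List[Row] = []
        for row in self.buffer:
            # Assign a fake tuple id at store time, not at insert() time.
            stored = dict(row, tid=len(self.table))
            self.table.append(stored)
            flushed.append(stored)
        self.buffer.clear()
        # The "AM" invokes the caller's callback only once rows are stored.
        if self.on_flush is not None:
            self.on_flush(flushed)

    def finish(self) -> None:
        # Flush whatever is left in the buffer.
        self.flush()


# Usage: the caller maintains a fake "index" from inside the callback.
index_entries: List[Tuple[object, object]] = []


def index_callback(rows: List[Row]) -> None:
    for r in rows:
        index_entries.append((r["key"], r["tid"]))


ins = BufferedInserter(flush_threshold=2, on_flush=index_callback)
for key in ("a", "b", "c"):
    ins.insert({"key": key})
ins.finish()
```

In this toy model the stored rows, including their ids, become visible
to the caller only inside the callback and only at flush time;
`insert()` itself returns nothing.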

Sorry about this late feedback, but while I'm generally +1 on the idea
and primary design, I feel that it doesn't quite cover all the areas
I'd expected it to cover. Specifically, I'm having trouble seeing how
this could be used to implement

```
INSERT INTO ... SELECT ... RETURNING ctid
```

as I see no returning output path for the newly inserted tuples' data,
which is usually required for our execution nodes' output path. Is
support for RETURNING clauses planned for this API?

In a previous iteration, the flush operation was capable of returning a
TTS, but that seems to have been dropped, and I can't quite figure out
why.

> Note: I believe this API will extend naturally to updates and deletes,
> as well.

I have the same concern about UPDATE ... RETURNING not fitting with
this callback-based design.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)