Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s7O69-005Qdc-00 for pgsql-hackers@arkaria.postgresql.org; Wed, 15 May 2024 23:31:50 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s7O68-00AJC7-Ru for pgsql-hackers@arkaria.postgresql.org; Wed, 15 May 2024 23:31:48 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s7O68-00AJ6v-DB for pgsql-hackers@lists.postgresql.org; Wed, 15 May 2024 23:31:48 +0000 Received: from mail-pf1-x432.google.com ([2607:f8b0:4864:20::432]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1s7O65-000PLF-Ro for pgsql-hackers@postgresql.org; Wed, 15 May 2024 23:31:47 +0000 Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-6f4302187c0so5757821b3a.1 for ; Wed, 15 May 2024 16:31:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=j-davis-com.20230601.gappssmtp.com; s=20230601; t=1715815904; x=1716420704; darn=postgresql.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=kklvjp0//WHt2CShuo5lOFEub7F5UxXipW2cHS+ec3M=; b=Dki43YNxfslqR1tRiTbWMR5ePaaUxya8BoR1SB85ljFG2PFn3QKSv3ZUnI6VPxsW3i 4yXMK2h00FkQBiyLfBhisO6SLBP85CT0Bqzpe63fLgMHkO7ILVi5pRaCYZzA6fvWw665 Z3ncE3Iiv2HWTslnv3mSkQjS7XfYuWjafgCaK3RBctvQe7FGYQhv2Al/WqkomBjCD5mx aeXgBiNjhXeDA9o6mWpOHAwvgQeR0PcmdEi8BNMPNTC1frFDeGJzoR9KJ/aMz58r2QxF DO9eCZI/xUR6UZfaYyr1oyivHiCwFp31H6TQ6tzaAXH1jq33QiL7rd9jUdidfDhe0ew5 nZ0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715815904; x=1716420704; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=kklvjp0//WHt2CShuo5lOFEub7F5UxXipW2cHS+ec3M=; b=VXb+v984f29mj0VYYbeyBT90IV6APLf4MqfegRud2saLKEFExisoyYSyES1ri4c9eV /tSkMfieYbCctk31u6RgVmyehgeCvaqGWCotU0EhL+6MGzH7lwp52uZK3ZKBf7M+hwBy 7PUUa67uelZP5NpJD3v2t1i50Rj9Vylc/77bIxwqo1KhPPjRBLcnkD28NzYvTgsMiy8t ozzSYAlkWsbSnRUC6aGsUugm7ZzRgIIKTP0NQbiAefs6Do8aq3X9wn92+6pcFP099IyJ dGYT+hsyUpFf++0O4WzwRs3fJ4/feAMR8kFvnEdNKmS4CR65Ox/YzF0E/DyZn7smVdJG cnbA== X-Forwarded-Encrypted: i=1; AJvYcCXO5+BnOZ+GEmJR25UXiecSsAfpQ7dAOfMQ7KwQEGl/c5v82DXnc8vw9pXkO9v00pI8k3ErPgjPBuSMV9vD1WK7+58fqWjaivi7kHBa X-Gm-Message-State: AOJu0Yz2yMdWXqUVlkXs5tCG8Api77GhaLRgFcEUZTIDjKs0CUrbkZP5 vPdpyyoKl8Trb/2FXrU36CviMvghsFHuT/sdNAodALIpBV54O3l9wgEoN40Whg== X-Google-Smtp-Source: AGHT+IFewqi+8IepmniQWxgUfCqFn9jxG07+xuQDdAwHIpHWR6aaho1r4EB7k/jQx8hZ6lMHMrhwaA== X-Received: by 2002:a05:6a20:6a26:b0:1b0:111f:2b6d with SMTP id adf61e73a8af0-1b0111f2cb7mr8566088637.28.1715815904505; Wed, 15 May 2024 16:31:44 -0700 (PDT) Received: from [172.18.10.46] ([12.126.244.130]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f4d2afecd0sm11722885b3a.180.2024.05.15.16.31.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 May 2024 16:31:43 -0700 (PDT) Message-ID: <229c4f7219ed164088dadc935df21e1cf125e191.camel@j-davis.com> Subject: Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM From: Jeff Davis To: Bharath Rupireddy Cc: Masahiko Sawada , PostgreSQL-development , Andres Freund , Dilip Kumar , Luc Vlaming , Justin Pryzby , Michael Paquier , Matthias van de Meent , Alexander Korotkov Date: Wed, 15 May 2024 16:31:42 -0700 In-Reply-To: References: <20230603223824.o7iyochli2dwwi7k@alap3.anarazel.de> <6be6f58815dc0844fbe058edf56b4e735a6efc1c.camel@j-davis.com> <2280bf7241119bb88cbe0fe5eb36490cbd04c0c0.camel@j-davis.com> <12f1a2d8dd3b6305c0354f1c701f44b7be5e54eb.camel@j-davis.com> <8633171cb034aafc260fdf37df04b6c779aa1e2f.camel@j-davis.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.44.4-0ubuntu2 MIME-Version: 1.0 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, 2024-05-15 at 12:56 +0530, Bharath Rupireddy wrote: > Because of this, the > buffers get flushed sooner than that of the existing COPY with > table_multi_insert AM causing regression in pgbench which uses COPY > extensively. The flushing behavior is entirely controlled by the table AM. The heap can use the same flushing logic that it did before, which is to hold 1000 tuples. I like that it's accounting for memory, too, but it doesn't need to be overly restrictive. Why not just use work_mem? That should hold 1000 reasonably-sized tuples, plus overhead. Even better would be if we could take into account partitioning. That might be out of scope for your current work, but it would be very useful. We could have a couple new GUCs like modify_table_buffer and modify_table_buffer_per_partition or something like that. > 1. Try to get the actual tuple sizes excluding header sizes for each > column in the new TAM. I don't see the point in arbitrarily excluding the header. > v21 also adds code to maintain tuple size for virtual tuple slots. > This helps make better memory-based flushing decisions in the new > TAM. That seems wrong. We shouldn't need to change the TupleTableSlot structure for this patch. Comments on v21: * All callers specify TM_FLAG_MULTI_INSERTS. What's the purpose? * The only caller that doesn't use TM_FLAG_BAS_BULKWRITE is ExecInsert(). What's the disadvantage to using a bulk insert state there? * I'm a bit confused by TableModifyState->modify_end_callback. The AM both sets the callback and calls the callback -- why can't the code just go into the table_modify_end method? * The code structure in table_modify_begin() (and related) is strange. Can it be simplified or am I missing something? * Why are table_modify_state and insert_modify_buffer_flush_context globals? What if there are multiple modify nodes in a plan? * Can you explain the design in logical rep? Regards, Jeff Davis