Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s7gLK-006oV5-H1 for pgsql-hackers@arkaria.postgresql.org; Thu, 16 May 2024 19:00:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s7gLK-00DXUt-7T for pgsql-hackers@arkaria.postgresql.org; Thu, 16 May 2024 19:00:42 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s7gLJ-00DXUl-UL for pgsql-hackers@lists.postgresql.org; Thu, 16 May 2024 19:00:41 +0000 Received: from mail-pf1-x430.google.com ([2607:f8b0:4864:20::430]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1s7gLH-000XXs-9B for pgsql-hackers@postgresql.org; Thu, 16 May 2024 19:00:40 +0000 Received: by mail-pf1-x430.google.com with SMTP id d2e1a72fcca58-6f67c6fc504so510851b3a.1 for ; Thu, 16 May 2024 12:00:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=j-davis-com.20230601.gappssmtp.com; s=20230601; t=1715886038; x=1716490838; darn=postgresql.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=Xh8d28iDAjJuNZ3Poq0rzOIWMuwcBir/kXOh2XLITCc=; b=wuYmCfyNcMQOvITxDTKtBitJ5QNuSv8T47s7NT4ljQWkwjGmJjU+iPWOG0OsGXgicv pc7AjnDDJVGJoYFvCDzMTCwUe8Rq/BstX+jeGZeEARxcoMUR+FUQHD20v9t4tdzDVitQ gobSeBROoUMX2CZoY5aROpJIWRaLHyX0i5I5XezaqtpdEQofyK1sZJUUR5paGBEAi++p pjN3T5mGzGvazYx8lIvjBDu8baABLeIxbV4wA3+LKFtVYvOdOfwtJVdyxhEdjC+6H5IV Secf3g21N9lUB+1unRBM9a0naJSVaSAf+OxGUD0L9cL8eoORazY6kme59ozNJy191BVR TYrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715886038; x=1716490838; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=Xh8d28iDAjJuNZ3Poq0rzOIWMuwcBir/kXOh2XLITCc=; b=wgDOnFgkK7JANd4F+4jQlqRFLoN0bnlrPL2Y3m089czVzYpn/gPfMCN8F75eStGcm9 i/3TuhEzon2pMWn7xVhZuhiyDbilvL9GOo0ZshHfAaEkG/8/B16pf1LTbyoAmejIkDm4 +HJ6ScgmKC3kLdLba0NNtgzXQ2DM5LRuGEM4BWAJnDBZLJhLCLmvFng3rzaqVmMq9SKx EwyR/N3vdNiOZhnm4hb3xzy2BV7CcgF4TMj6I+v4DWB/dZ4EZFk/PnJLsC5M4hsI2ufI 4rN3sy2i8MA98WONN47T2DObpOC9zwoxCBXl8Hl39T2xiMOah1LsQO9465eAIH5KcPad y3Fw== X-Forwarded-Encrypted: i=1; AJvYcCXSqdQ6tclFh+hYLf95j648TEjtaID0NLvc2LJ20Yaoy2crc3GFAYWp95Jo3L8zREVBLey0ooz2W8+q9e/UdWxWlyu/TuPIdA1EB+n0 X-Gm-Message-State: AOJu0YxzM3TE1H2jIXsLZ3rolBk4qfCHM7OBD5yPMjCUHrlwWPj6/YgA 20XpBL2/NcZfT6z2Prb4QvIvN2FUboM0V2QjTpmD09zpU3sI7J5INI4NBTd09g== X-Google-Smtp-Source: AGHT+IFevmdIvCltGhfkENNDSWiUG2VFYtZvrqT/V++1DiO0mP8a9KqYKb1Wk3VI5GIFjc7sHwl60Q== X-Received: by 2002:a05:6a21:329d:b0:1af:cd0c:d342 with SMTP id adf61e73a8af0-1afde1d7c5fmr23353043637.52.1715886038020; Thu, 16 May 2024 12:00:38 -0700 (PDT) Received: from jeff-laptop.lan (c-76-102-242-158.hsd1.ca.comcast.net. [76.102.242.158]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f4d2a828e2sm13433108b3a.72.2024.05.16.12.00.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 16 May 2024 12:00:37 -0700 (PDT) Message-ID: Subject: Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM From: Jeff Davis To: Bharath Rupireddy Cc: Masahiko Sawada , PostgreSQL-development , Andres Freund , Dilip Kumar , Luc Vlaming , Justin Pryzby , Michael Paquier , Matthias van de Meent , Alexander Korotkov Date: Thu, 16 May 2024 12:00:36 -0700 In-Reply-To: <229c4f7219ed164088dadc935df21e1cf125e191.camel@j-davis.com> References: <20230603223824.o7iyochli2dwwi7k@alap3.anarazel.de> <6be6f58815dc0844fbe058edf56b4e735a6efc1c.camel@j-davis.com> <2280bf7241119bb88cbe0fe5eb36490cbd04c0c0.camel@j-davis.com> <12f1a2d8dd3b6305c0354f1c701f44b7be5e54eb.camel@j-davis.com> <8633171cb034aafc260fdf37df04b6c779aa1e2f.camel@j-davis.com> <229c4f7219ed164088dadc935df21e1cf125e191.camel@j-davis.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.44.4-0ubuntu2 MIME-Version: 1.0 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, 2024-05-15 at 16:31 -0700, Jeff Davis wrote: > Even better would be if we could take into account partitioning. That > might be out of scope for your current work, but it would be very > useful. We could have a couple new GUCs like modify_table_buffer and > modify_table_buffer_per_partition or something like that. To expand on this point: For heap, the insert buffer is only 1000 tuples, which doesn't take much memory. But for an AM that does any significant reorganization of the input data, the buffer may be much larger. For insert into a partitioned table, that buffer could be multiplied across many partitions, and start to be a real concern. We might not need table AM API changes at all here beyond what v21 offers. The ModifyTableState includes the memory context, so that gives the caller a way to know the memory consumption of a single partition's buffer. And if it needs to free the resources, it can just call modify_table_end(), and then _begin() again if more tuples hit that partition. So I believe what I'm asking for here is entirely orthogonal to the current proposal. However, it got me thinking that we might not want to use work_mem for controlling the heap's buffer size. Each AM is going to have radically different memory needs, and may have its own (extension) GUCs to control that memory usage, so they won't honor work_mem. We could either have a separate GUC for the heap if it makes sense, or we could just hard-code a reasonable value. Regards, Jeff Davis