Date: Tue, 24 Feb 2026 19:29:17 +0100
From: Alvaro Herrera
To: Antonin Houska
Cc: Mihail Nikalayeu, Pg Hackers,
Robert Treat
Subject: Re: Adding REPACK [concurrently]
Message-ID: <202602241757.6ac3iss2u4vo@alvherre.pgsql>
In-Reply-To: <202602231821.eqzziask6rjj@alvherre.pgsql>

On 2026-Feb-23, Alvaro Herrera wrote:

Looking at this function in pgoutput_repack.c:

> +/* Store concurrent data change. */
> +static void
> +store_change(LogicalDecodingContext *ctx, ConcurrentChangeKind kind,
> +             HeapTuple tuple)
> +{
[...]

we have this:

> +    size = VARHDRSZ + SizeOfConcurrentChange;
> +
> +    /*
> +     * ReorderBufferCommit() stores the TOAST chunks in its private memory
> +     * context and frees them after having called apply_change(). Therefore
> +     * we need flat copy (including TOAST) that we eventually copy into the
> +     * memory context which is available to decode_concurrent_changes().
> +     */
> +    if (HeapTupleHasExternal(tuple))
> +    {
> +        /*
> +         * toast_flatten_tuple_to_datum() might be more convenient but we
> +         * don't want the decompression it does.
> +         */
> +        tuple = toast_flatten_tuple(tuple, dstate->tupdesc);
> +        flattened = true;
> +    }
> +
> +    size += tuple->t_len;
> +    if (size >= MaxAllocSize)
> +        elog(ERROR, "Change is too big.");
> +
> +    /* Construct the change. */
> +    change_raw = (char *) palloc0(size);
> +    SET_VARSIZE(change_raw, size);

I wonder if this isn't problematic with large tuples. If a row has some
very wide columns, each of which individually is less than 1 GB, then it
might happen that the sum of their sizes exceeds 1 GB, causing palloc()
to complain and abort the whole repack operation. This wouldn't be very
nice, so I think we need to address it somehow.

Another thing I'm not very keen on is the fact that we have to memcpy()
the tuple contents a few lines below:

> +    /*
> +     * Copy the tuple.
> +     *
> +     * Note: change->tup_data.t_data must be fixed on retrieval!
> +     */
> +    memcpy(&change.tup_data, tuple, sizeof(HeapTupleData));
> +    memcpy(dst, &change, SizeOfConcurrentChange);
> +    dst += SizeOfConcurrentChange;
> +    memcpy(dst, tuple->t_data, tuple->t_len);
> +    /* Store as tuple of 1 bytea column. */
> +    values[0] = PointerGetDatum(change_raw);
> +    isnull[0] = false;
> +    tuplestore_putvalues(dstate->tstore, dstate->tupdesc_change,
> +                         values, isnull);

To make matters worse, tuplestore_putvalues does a
heap_form_minimal_tuple() on this and copies the data again. This seems
pretty wasteful. I think we need some new APIs to avoid all this
copying.

It appears that it all starts with reorderbuffer doing something
unhelpful with the memory context of the TOAST chunks. Maybe we should
address this by "fixing" reorderbuffer so that it doesn't do this,
instead of playing so many games to cope.

-- 
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
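
For illustration only, here is a minimal standalone sketch of the size arithmetic behind the large-tuple concern above. It is not code from the patch. SKETCH_MAX_ALLOC_SIZE is an assumption that mirrors the value of PostgreSQL's MaxAllocSize (1 GB minus one byte), redefined so the snippet builds without server headers, and both helper functions are hypothetical names. The point it shows is that every column can pass the limit on its own while the flattened change does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: same value as PostgreSQL's MaxAllocSize (0x3fffffff),
 * redefined here so the sketch compiles standalone.
 */
#define SKETCH_MAX_ALLOC_SIZE ((uint64_t) 0x3fffffff)

/*
 * Hypothetical helper: total size of a flattened change, i.e. a fixed
 * header plus the widths of all (already detoasted) columns.
 */
static uint64_t
flattened_change_size(const uint64_t *col_widths, size_t ncols,
                      uint64_t header_size)
{
    uint64_t    total = header_size;

    assert(col_widths != NULL || ncols == 0);
    for (size_t i = 0; i < ncols; i++)
        total += col_widths[i];
    return total;
}

/*
 * Hypothetical helper: same comparison as the quoted check, where
 * "size >= MaxAllocSize" raises an error.
 */
static bool
change_fits_single_alloc(uint64_t size)
{
    return size < SKETCH_MAX_ALLOC_SIZE;
}
```

With two 600 MB columns, each width passes change_fits_single_alloc() individually, but the combined flattened size does not, which is the case where the repack would be aborted.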