Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vvHMD-00Bj9z-1Q for pgsql-hackers@arkaria.postgresql.org; Wed, 25 Feb 2026 16:03:25 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vvHMC-0075FO-0s for pgsql-hackers@arkaria.postgresql.org; Wed, 25 Feb 2026 16:03:24 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vvHMB-0075FG-2s for pgsql-hackers@lists.postgresql.org; Wed, 25 Feb 2026 16:03:23 +0000 Received: from mail-wm1-x336.google.com ([2a00:1450:4864:20::336]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vvHM8-000000018op-2dhX for pgsql-hackers@lists.postgresql.org; Wed, 25 Feb 2026 16:03:22 +0000 Received: by mail-wm1-x336.google.com with SMTP id 5b1f17b1804b1-483bd7354efso15997205e9.2 for ; Wed, 25 Feb 2026 08:03:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cybertec.at; s=google; t=1772035400; x=1772640200; darn=lists.postgresql.org; h=message-id:date:content-transfer-encoding:content-id:mime-version :comments:references:in-reply-to:subject:cc:to:from:from:to:cc :subject:date:message-id:reply-to; bh=iAvr3z7N5y8AKOtZ/PEvLr/om+PgXuAf91kZKntU11M=; b=h+iS2Pgb11zzXiXBgltBhcoh8M+UaXj4oX8IwSDbmMKZLH7Q+9RyHfGgPzB5i+Edw0 uOJqD6nD04jG9J51D9ZWohe9DDZDzqcIvHm3JNwYp667G/1GSC98bS1hsCHUXOPPVGxt 99d0lr3BP+x3OV0jT4QeaI2giOCPjbUXVNkB458NikuFLBPJmIy5jnZvg7SseBES7vxc M+PidsuGpQ9DRm60Ea/zZ0iosY2j7+ZKaGJLAipv5CtR6OnpAL+6Fa+gqEk63tg5Zfre Nb6TULbwKkfdVEvbcxtvgRGK9ROdTH1R2BFHscPwpyRnKHhqn4T5mXW9Ld0ZK1SKDoWN TAhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1772035400; x=1772640200; h=message-id:date:content-transfer-encoding:content-id:mime-version :comments:references:in-reply-to:subject:cc:to:from:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=iAvr3z7N5y8AKOtZ/PEvLr/om+PgXuAf91kZKntU11M=; b=aD4T9JTAdOfdAobY0jXkzmT4n90AL6tkcokykWR8iMl6mm0NFjmKRNLdAXXHOqdZQu l8/NbOpnUIXjgkDFEi6HaSYq7rnuryY8VOqdPdXbGWXB5eXDTVbVWmLmcc6I5SswudyI rA4czFwaeroICunvkq+tX7YQ7Kpk6dyo6EaD3DB//CRex6885bpFk9NRiC0uZLwrEAV+ YggDZM+2ltlWaH6nph3I1y+4NNVt0q0VyFxElDEdpVsIPCvUEa9MNv+wbCPiqPAILp1x bw2mfcn3Cywf4rvX2jdbaGEEEdpXs/+OUwq7uL+BJr00jHDTSQZiXJoA7p+9AGz/Vu/u 2+pQ== X-Forwarded-Encrypted: i=1; AJvYcCVBJxLPEpII/x2bQxi8wUW+ueXI7vXkJm+RmeFkUSBlcTacjN+dq+YVW9thlb2EKzN4TdFvZTBxpQH1f2an@lists.postgresql.org X-Gm-Message-State: AOJu0Yz0VQmW3mBsy1w5UmnV4BLwQdsoUCh26AkjmVYesmTGn3shd3CQ tPBr/KaDv5/zFnlyBUfG5ym4hV+cssTkdCoB/DqvQMfcc8wDumMX8v7QylVfZq7DVJc= X-Gm-Gg: ATEYQzzyQ8bIWrAcEDGJC8d81TZnjmE2WzbCIkVUFPtEI3uQj7CXNeBmqgnGw8vAs9E KrniHqAxtm0oMzdss0NMGAXxL6qct4eBg3lkQ1/gJ6DsCaLfjBWmwJasUqJIo5GYv8JB/hknSNL +gEHgg1b0eNxE7+VuGGF6i+wxbJa2EIzTgvIx+qtUgJZTr2b7xfTjhsw83Tky81xZWIjjfKMzwr b3kgwxf/zA8+9PY15ReP+u7+Za40O3soXfVbYvxNhr8KeBJRUIVTUFNYrrouMmED7Z2NFsJ+Chp qSlopEMgP29ptDKFaAsa/9C6feV6W7gN9qEAwECxy1Fd5M0dnPiblTtXb/ojwfrkunwRBoYbt90 zbMDXnuPPPnAL9PX/Tq6sWmyJfD8bep2HrtIGnEBpR9+7LtLUp6p28AEACcKdyosNz983d04tmu LU12ivTMlmu+956B1O5+3vIj+fnXLjscuaI9aA X-Received: by 2002:a05:600d:640f:20b0:477:9b4a:a82 with SMTP id 5b1f17b1804b1-483c252de9bmr6080585e9.35.1772035400120; Wed, 25 Feb 2026 08:03:20 -0800 (PST) Received: from localhost (109-81-168-142.rct.o2.cz. [109.81.168.142]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-483bfcbf789sm13791185e9.17.2026.02.25.08.03.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Feb 2026 08:03:19 -0800 (PST) From: Antonin Houska To: Alvaro Herrera cc: Mihail Nikalayeu , Pg Hackers , Robert Treat Subject: Re: Adding REPACK [concurrently] In-reply-to: <202602251344.gtuwb7nbk7m3@alvherre.pgsql> References: <202602251344.gtuwb7nbk7m3@alvherre.pgsql> Comments: In-reply-to Alvaro Herrera message dated "Wed, 25 Feb 2026 14:54:09 +0100." X-Mailer: MH-E 8.6+git; nmh 1.8; GNU Emacs 28.3 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <93329.1772035399.1@localhost> Content-Transfer-Encoding: quoted-printable Date: Wed, 25 Feb 2026 17:03:19 +0100 Message-ID: <93330.1772035399@localhost> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Alvaro Herrera wrote: > On 2026-Feb-25, Antonin Houska wrote: > > For REPACK, I suggest a variant of toast_flatten_tuple() that writes t= he > > output to a file, and a corresponding function that reads it while all= ocating > > separate chunks of memory for the individual TOASTed attributes - the = restored > > tuple would reference the chunks using the "external indirect" TOAST p= ointers, > > as if it had been processed by ReorderBufferToastReplace(). Does that = make > > sense to you? > = > Hmm, so on the apply side when reading the file, we would first reach > each toast attribute value, which we know to insert directly to the > toast table (keeping track of each individually toast pointer as we do > so); then we reach the heap tuple itself, we [... somehow ...] interpret > these external indirect toast pointers and substitute the toast pointers > that we created. So we never have to construct the entire tuple, or > indeed do anything else with the toasted values other than insert them > into the toast table. Yes, that's what I mean. > Actually, can't we simply insert the toasted values directly in the > decoding worker into the new toast table? That could save a lot of > writing to the file, since we only save the raw heap tuples with no > toasted contents; but it's not clear to me that this is valid. (And we > might create extra bloat if a tuple is inserted and later deleted > concurrently with the repack; but that would happen with the original > approach as well, no?) The problem I see here is that for UPDATE you need the old tuple to determ= ine if its TOAST value should be deleted or if the new tuple should reuse it - this is how I understand toast_tuple_init(). So the worker would have to s= tore all the changes somewhere temporarily until it can fully apply the changes (i.e. until the initial copy and index build is complete). Besides that, if the worker had to switch between the past (for the decodi= ng) and present (for the TOAST operations), it would have to invalidate system caches repeatedly. 0004 does that, but 0005 makes that unnecessary. (I don= 't know if the repeated cache invalidation would be a serious performance problem, but from coding perspective I find it more convenient if the work= er only deals with decoding and does not have to do this time travel and invalidations at all.) -- = Antonin Houska Web: https://www.cybertec-postgresql.com