Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v862K-00Cfqf-Dr for pgsql-hackers@arkaria.postgresql.org; Mon, 13 Oct 2025 00:03:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v862I-0033as-4u for pgsql-hackers@arkaria.postgresql.org; Mon, 13 Oct 2025 00:03:35 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v862H-0033aj-MF for pgsql-hackers@lists.postgresql.org; Mon, 13 Oct 2025 00:03:34 +0000 Received: from mail-vk1-xa32.google.com ([2607:f8b0:4864:20::a32]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v862F-001pSX-25 for pgsql-hackers@lists.postgresql.org; Mon, 13 Oct 2025 00:03:34 +0000 Received: by mail-vk1-xa32.google.com with SMTP id 71dfb90a1353d-554ff1682c8so348556e0c.1 for ; Sun, 12 Oct 2025 17:03:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=xzilla-net.20230601.gappssmtp.com; s=20230601; t=1760313810; x=1760918610; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=nHEwxHh7L1UGC5Dev1xq85WcpkjSC8DyTTVGGWmeJEw=; b=hPO2+QyuO5NG8Pkwt/J7yOQA3rgRjzQBbFWb9N5AH3NUoizlpCS7WX4RPCkkVO0MGt 8ngWAkL52HCjbJKz39xF2/pcpSf5kVNf4XQcT+Nek/OkIRHwFGw9Jmd7RsG/aOeCwuiT hiZiQVBW2DLliCxOTRfhSEksXGc06VkYoOZgi9YjvRuissWQinDamizrILpHrVGN5v6Z vWUssuqw61HwRA0074dDJuqxUgRqoEPbDM+gxMhzGoa/yoPEmAockplmgU4Ts41ModQv u3vPV6OauLqd55xuwNH1O+BUFl2Q6Fm7+Ai813s86Me8+nyFNACS/rNebvYLLEgmzf3L jpcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1760313810; x=1760918610; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nHEwxHh7L1UGC5Dev1xq85WcpkjSC8DyTTVGGWmeJEw=; b=pC8yozF+4qv/O2efxUAzXafCOJlnGZrIvMUDs7rtbY90/UBGdbYVwfSTpu1lHwERNu Tj8rbQ0BIjSjljOtIU5nVeRfkeUmyV+Cyu+QWcB36EQ0fY/ZptVETNP3yQQ24hIrU9jF rlMLoR1KU2AfMi1iqlsxDAxt0ISitjsDlqpR6WKIEROdRI9OQzIytVkEuinvIGUwUBud ZOUXmoEggjunH5HO7ALQMGuCl4ARPLz3hOgT/Bkad2Gq7eULodb8TkIrPBG/9QMpgGSv pLLtoW4qgE6yZAiUUtUaMEoPbTveGqoO8LVUJT9+yhwfYryYE2x7N8lcd7YCbJOAJpNf pGYw== X-Forwarded-Encrypted: i=1; AJvYcCU7re7V09vR/eSesBLtZXNeFnjbS4DPMohXwLaq7KwjA4NukgIEkWLemxR9S/yDctvNVgak1Ywnf0ZZmrgp@lists.postgresql.org X-Gm-Message-State: AOJu0YzMUi/mA847/X2QWFc20k94R/GN4vEe+aRkD813HkEeUvbuuKCX oECTdr4YYHzOGCeBvLJqlyzdm776fWRzg0rpycZmkhKYN/leFb8Cp+uv1jrahsrnroFPKlAZGee M/Vg2IlyYUmdEwds7SSE2ehCSChbBnppjWB/GlXGR/Q== X-Gm-Gg: ASbGncucT2WyTW2D4jyS3AKYvaiPf7nu01Eh23MOtx9wzV8jq209U2HYFocbBnOtd/X l+2cWMmna7xQZjSTG8Wdvtehz27gcODAiEU3JOX6MVAzSm0aKdZwIZkg1bGLXvmHdgQ1tsY/pg6 5b3QPzZWrVYpDtIEbOoni0UsZeHrnvoRea2ipZsB/9NQJRlGGujd2s18/62Z3df0oVgDPNjtW4p 1h0IwjVF3ZBFbxr3RIm6y4GpE2W7n/rGc5L X-Google-Smtp-Source: AGHT+IF+esNIcl5f6LMtk9voubdNTalR/W9MyVHhHu1Wp3G2ZwOYejBlKIdSBDXY6L0Dye7CHfK12zk3DO0EF0VQ0/E= X-Received: by 2002:a05:6102:6888:b0:5a2:668d:f20b with SMTP id ada2fe7eead31-5d5e236b1a7mr9236690137.16.1760313809869; Sun, 12 Oct 2025 17:03:29 -0700 (PDT) MIME-Version: 1.0 References: <202510070909.biaseppvydvg@alvherre.pgsql> In-Reply-To: <202510070909.biaseppvydvg@alvherre.pgsql> From: Robert Treat Date: Sun, 12 Oct 2025 20:03:18 -0400 X-Gm-Features: AS18NWCEla26_vJpWaWknitp7dIL9WXFTzo18gPQelLJ901JdcyB1qMs7aTwDQY Message-ID: Subject: Re: Adding REPACK [concurrently] To: =?UTF-8?Q?=C3=81lvaro_Herrera?= Cc: Mihail Nikalayeu , Pg Hackers , Antonin Houska , Fujii Masao Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Oct 7, 2025 at 10:05=E2=80=AFAM =C3=81lvaro Herrera wrote: > On 2025-Sep-26, Robert Treat wrote: > That said, on this topic, I've always been bothered by our usage of > command names as verbs, because they are (IMO) horrible for translation. > For instance, in this version of the patch I am making this change: > > if (OidIsValid(indexOid) && OldHeap->rd_rel->relisshared) > ereport(ERROR, > - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > - errmsg("cannot cluster a shared catalog"))); > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot run %s on a shared catalog", > + RepackCommandAsString(cmd))); > > In the old version, the message is not very translatable because you > have to find a native word to say "to cluster" or "to vacuum", and that > doesn't always work very well in a direct translation. For instance, in > the Spanish message catalog you find this sort of thing: > > msgid "vacuuming \"%s.%s.%s\"" > msgstr "haciendo vacuum a =C2=AB%s.%s.%s=C2=BB" > > which is pretty clear ... but the reason it works, is that I have turned > the phrase around before translating it. I would struggle if I had to > find a Spanish verb that means "to repack" without contorting the > message or saying something absurd and/or against Spanish language > rules, such as "ejecutando repack en table XYZ" or "repaqueando tabl > XYZ" (that's not a word!) or "reempaquetando tabla XYZ" (this is > correct, but far enough from "repack" that it's annoying and potentially > confusing). So I would rather the original used "running REPACK on > table using method XYZ", which is very very easy to translate, and then > the translator doesn't have to editorialize. > I see you didn't do this in the current patch, but +1 for this idea from me. And if you think it'd help, I'm also +1 on the idea for the main docs as well, for example doing something like + - pg_repackdb is a utility for repacking a + pg_repackdb is a utility for running REPACK = on a + PostgreSQL database. I'd be inclined to leave the internal comments alone though, since they aren't translated. > > #5 > > [xzilla@zebes] pgsql/bin/pg_repackdb -d pagila -v -t film --index > > pg_repackdb: repacking database "pagila" > > > > In the above scenario, I am repacking without having previously > > specified an index. At the SQL level this would throw an error, at the > > command line it gives me a heart attack. :-) > > It's actually not that bad, because we don't actually do anything, but > > maybe we should throw an error? > > Yeah, I think this is confusing. I think we should make pg_repackdb > explicitly indicate what has been done, in all cases, without requiring > -v. Otherwise it's too confusing, particularly for the using-index mode > which determines which tables to process based on the existance of an > index marked indiscluster. > At the moment, clusterdb runs silently, but vacuumdb emits output, so there is an argument for either way as default behavior. That said, I think the current behavior of vacuum, which is what we are currently following in pg_repackdb, is the worst of the two: [xzilla@zebes] pgsql/bin/vacuumdb -t actor pagila vacuumdb: vacuuming database "pagila" Without any additional information, the information we do give is misleading; I would rather not say anything. We could of course try to make this more verbose, but I think clusterdb actually gets this right... - say nothing by default (follow the "rule of silence.") - if we want to see commands, pass -e - if we want to see the details, pass -v - if we do something that causes an error, return the error - if we don't want errors, pass -q This is also how reindexdb works, and I think most of the other utilities, and I'd argue this is how vacuumdb should work... to the extent I almost consider it a bug that it doesn't (I leave a little room since I am not sure why it doesn't operate like the other utilities). vacuum is a bit outside the purview of what we are doing here, but I do think following clusterdb/reindexdb is the behavior we should follow for pg_repackdb. > I admit I haven't paid too much attention to these tests. I think I > would rather create a separate src/test/regress/sql/repack.sql file with > the tests for this command. Let's consider this part a WIP for now -- > clearly more tests are needed both for the SQL command CLUSTER and for > pg_repackdb. Yeah, istm as long as we have all 3 commands (repack, cluster, vacuum full) we need regression tests for all 3. > - pg_stat_progress_cluster is no longer a view on top of the low-level > pg_stat_get_progress_info() function. Instead, it's a view on top of > pg_stat_progress_repack. The only change it applies on top of that > one is change the command from REPACK to one of VACUUM FULL or > CLUSTER, depending on whether an index is being used or not. This > should keep the behavior identical to previous versions. > Alternatively we could just hide rows where the command is REPACK, but > I don't think that would be any better. This way, we maintain > compatibility with tools reading pg_stat_progress_cluster. Maybe this > is useless and we should just drop the view, not sure, we can discuss > separately. > I think this mostly depends on how aggressive you want to be in moving people away from cluster and toward repack. If we remove _progress_cluster, it will force people to update monitoring which probably encourages people to switch to pg_repackdb. We probably need to have at least one "bridge" release though, and I think you've got the right balance for that. > - I noticed that you can do "CLUSTER pg_class ON some_index" and it will > happily modify pg_index.indisclustered, which is a bit weird > considering that allow_system_table_mods is off -- if you later try > ALTER TABLE .. SET WITHOUT CLUSTER, it won't let you. I think this is > bogus and we should change it so that CLUSTER refuses to change the > clustered index on a system catalog, unless allow_system_table_mods is > on. However, that would be a change from longstanding behavior which > is specifically tested for in regression tests, so I didn't do it. > We can discuss such a change separately. But I did make REPACK refuse > to do that, because we don't need to propagate bogus historical > behavior. So REPACK will fail if you try to change the indisclustered > index, but it will work fine if you repack based on the same index as > before, or repack with no index. > Since cluster will presumably be deprecated with this release, I'd leave the existing behavior and move forward with repack as you've laid out. > - pg_repackdb: if you try with a non-superuser without specifying a > table name, it will fail as soon as it hits the first catalog table or > whatever with "ERROR: cannot lock this table". This is sorta fine for > vacuumdb, but only because VACUUM itself will instead say "WARNING: > cannot lock table XYZ, skipping", so it's not an error and vacuumdb > keeps running. IMO this is bogus: vacuumdb should not try to process > tables that it doesn't have privileges to. However, not wanting to > change longstanding behavior, I left that alone. For pg_repackdb, I > added a condition in the WHERE clause there to only fetch tables that > the current user has MAINTAIN privilege over. Then you can do a > "pg_repackdb -U foobar" and it will nicely process the tables that > that user is allowed to process. We can discuss changing the vacuumdb > behavior separately. Again, vacuumdb seems to be a good example of what not to do, but I'll leave that for another thread. In general I like this idea, but it does make for a weird corner case where if I specify a table with -t that I don't have permission to repack, repack returns silently whilst doing nothing. I suppose one way to handle that would be to check if the table passed in -t is found in the list of tables with MAINTAIN privileges, and if not to issue a WARNING like "%s not found. Make sure that the table exists and that you have MAINTAIN privileges". Robert Treat https://xzilla.net