public inbox for [email protected]help / color / mirror / Atom feed
Re: Adding REPACK [concurrently] 6+ messages / 3 participants [nested] [flat]
* Re: Adding REPACK [concurrently] @ 2026-03-16 14:21 Matthias van de Meent <[email protected]> 2026-03-16 15:49 ` Re: Adding REPACK [concurrently] Robert Treat <[email protected]> 2026-03-16 19:24 ` Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 0 siblings, 2 replies; 6+ messages in thread From: Matthias van de Meent @ 2026-03-16 14:21 UTC (permalink / raw) To: Alvaro Herrera <[email protected]>; +Cc: Antonin Houska <[email protected]>; Mihail Nikalayeu <[email protected]>; Pg Hackers <[email protected]>; Robert Treat <[email protected]> On Mon, 16 Mar 2026 at 13:21, Alvaro Herrera <[email protected]> wrote: > > On 2026-Mar-16, Antonin Houska wrote: > > > One more problem related to the replication slot is that, due to the call of > > CheckSlotPermissions() in setup_logical_decoding(), REPLICATION privilege is > > required for REPACK (CONCURRENTLY) to run. That's not too user-friendly. > > > > I think the reason to require the REPLICATION privilege is that, in generic > > case, the output plugin can access data of any table in the database. However > > REPACK uses one particular plugin and that plugin only decodes changes of one > > particular table. Thus I think we don't really need to call > > CheckSlotPermissions(). Do I seem to miss something? > > Yeah, I don't think it makes sense to require REPLICATION privilege to > run REPACK. I don't think it makes sense to allow just any table owner to modify the effective_wal_level GUC; which is what the effect would be of removing the REPLICATION requirement from roles that want to REPACK CONCURRENTLY -- and in doing so create logical slots, which increase effective_wal_level to logical. Creating a replication slot requires REPLICATION privilege, because it consumes a (possibly very) limited server resource that can't be increased without restart and because it impacts other backends' WAL performance through effective_wal_level. Allowing users to consume said resource without first having the appropriate permissions makes this flag practically meaningless. Note that most of my argument hinges on the impact on other, unrelated databases/tables/sessions. Replication slots have a hard cap defined at startup, and effective_wal_level increases the WAL generated by practically all backends. If replication slot counts were SIGHUP, and RelationIsLogicallyLogged() returned true only in databases with LR slots, and only for the tables actually present in publications, I'd consider this much less problematic. However, we don't live in that world, so I am opposed to allowing table owners without REPLICATION to take any/all replication slots. Matthias van de Meent Databricks (https://www.databricks.com) ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Adding REPACK [concurrently] 2026-03-16 14:21 Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> @ 2026-03-16 15:49 ` Robert Treat <[email protected]> 2026-03-16 17:55 ` Re: Adding REPACK [concurrently] Mahendra Singh Thalor <[email protected]> 1 sibling, 1 reply; 6+ messages in thread From: Robert Treat @ 2026-03-16 15:49 UTC (permalink / raw) To: Alvaro Herrera <[email protected]>; +Cc: Matthias van de Meent <[email protected]>; Antonin Houska <[email protected]>; Mihail Nikalayeu <[email protected]>; Pg Hackers <[email protected]> On Mon, Mar 16, 2026 at 11:11 AM Alvaro Herrera <[email protected]> wrote: > On 2026-Mar-16, Matthias van de Meent wrote: > > > However, we don't live in that world, so I am opposed to allowing > > table owners without REPLICATION to take any/all replication slots. > > I think requiring REPACK users to have the REPLICATION priv is rather > user unfriendly. Some potential REPACK users might not have any other > use for replication at all. > For many users, I feel like repack concurrently making use of replication machinery is an implementation detail that some will find quite surprising (pg_repack doesn't use it), so I'd agree requiring REPLICATION priv is both unfriendly and a bit counter intuitive, especially if you need to run repack concurrently on a stand alone server. That said, I think Matthias is right that we can't allow "repackers" to block "replicators"... > > Note that most of my argument hinges on the impact on other, unrelated > > databases/tables/sessions. Replication slots have a hard cap defined > > at startup, and effective_wal_level increases the WAL generated by > > practically all backends. > > I'd rather have a new GUC to declare a bunch of additional slots that > are reserved exclusively for repack, set its default to something like > 3, and call it a day. If all repack slots are in use, you don't get to > run repack, period. > > A slot costs nothing if unused, and we really don't want to make the > interaction with regular replication more complicated than it is today. > I'm never excited about adding GUCs, but at first thought this seems like a decent work-around; most people are unlikely to run multiple repack concurrently's, but they can if needed. (I think the most likely use case is on clusters using the "database per customer" pattern, but if we have the guc, people will have a means to deal with it). Robert Treat https://xzilla.net ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Adding REPACK [concurrently] 2026-03-16 14:21 Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 2026-03-16 15:49 ` Re: Adding REPACK [concurrently] Robert Treat <[email protected]> @ 2026-03-16 17:55 ` Mahendra Singh Thalor <[email protected]> 0 siblings, 0 replies; 6+ messages in thread From: Mahendra Singh Thalor @ 2026-03-16 17:55 UTC (permalink / raw) To: Alvaro Herrera <[email protected]>; +Cc: Robert Treat <[email protected]>; Matthias van de Meent <[email protected]>; Antonin Houska <[email protected]>; Mihail Nikalayeu <[email protected]>; Pg Hackers <[email protected]> On Mon, 16 Mar 2026 at 21:33, Alvaro Herrera <[email protected]> wrote: > > On 2026-Mar-16, Robert Treat wrote: > > > I'm never excited about adding GUCs, but at first thought this seems > > like a decent work-around; most people are unlikely to run multiple > > repack concurrently's, but they can if needed. (I think the most > > likely use case is on clusters using the "database per customer" > > pattern, but if we have the guc, people will have a means to deal with > > it). > > I wonder if, longer term, it would make sense to do away with the > max_replication_slots GUC (and this new one) altogether, and use dynamic > shared memory for slots instead. There's of course always the danger > that people would accumulate arbitrary numbers of slots since they would > never be forced to check. But that may be a lesser problem than having > to gauge these GUCs with any care. > > -- > Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ > "You're _really_ hosed if the person doing the hiring doesn't understand > relational systems: you end up with a whole raft of programmers, none of > whom has had a Date with the clue stick." (Andrew Sullivan) > https://postgr.es/m/[email protected] > > Hi all, I was reading this thread and was doing some tests. postgres=# create table test1(a int); CREATE TABLE postgres=# create table test2(a int); CREATE TABLE postgres=# *vacuum full test1 , test2;* VACUUM postgres=# repack test1; REPACK postgres=# repack test2; REPACK postgres=#* repack test1, test2;* ERROR: syntax error at or near "," LINE 1: repack test1, test2; ^ I was not expecting any error but maybe I am missing something (some patch needs to be applied to test this query?). Thanks and Regards Mahendra ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Adding REPACK [concurrently] 2026-03-16 14:21 Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> @ 2026-03-16 19:24 ` Matthias van de Meent <[email protected]> 2026-03-16 21:45 ` Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 2026-03-16 21:50 ` Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 1 sibling, 2 replies; 6+ messages in thread From: Matthias van de Meent @ 2026-03-16 19:24 UTC (permalink / raw) To: Alvaro Herrera <[email protected]>; +Cc: Antonin Houska <[email protected]>; Mihail Nikalayeu <[email protected]>; Pg Hackers <[email protected]>; Robert Treat <[email protected]> On Mon, 16 Mar 2026 at 16:10, Alvaro Herrera <[email protected]> wrote: > > On 2026-Mar-16, Matthias van de Meent wrote: > > > Note that most of my argument hinges on the impact on other, unrelated > > databases/tables/sessions. Replication slots have a hard cap defined > > at startup, and effective_wal_level increases the WAL generated by > > practically all backends. > > I'd rather have a new GUC to declare a bunch of additional slots that > are reserved exclusively for repack, set its default to something like > 3, and call it a day. If all repack slots are in use, you don't get to > run repack, period. > > A slot costs nothing if unused, and we really don't want to make the > interaction with regular replication more complicated than it is today. There are a few overheads even for unused slots: each slot uses some shared memory, and many (most) functions that operate on the shared slot state have O(n_slots) overhead through e.g. ReplicationSlotsComputeRequiredXmin(); so having a 10k slot limit whilst using only 1 will be slower for that one active slot than if the limit was just 10. Granted, that's not a lot of overhead, but it's not free. > > However, we don't live in that world, so I am opposed to allowing > > table owners without REPLICATION to take any/all replication slots. > > I think requiring REPACK users to have the REPLICATION priv is rather > user unfriendly. Some potential REPACK users might not have any other > use for replication at all. I agree it's not user-friendly, but that's the point of limiting permissions. Users can't install c-functions without SUPERUSER, because it can cause cluster instability and crashes. Users can't create slots without REPLICATION, because they'll be able to negatively impact the whole cluster's performance, and possibly, stability, when taking up replication slots that otherwise would be used for critical HA purposes. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Adding REPACK [concurrently] 2026-03-16 14:21 Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 2026-03-16 19:24 ` Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> @ 2026-03-16 21:45 ` Matthias van de Meent <[email protected]> 1 sibling, 0 replies; 6+ messages in thread From: Matthias van de Meent @ 2026-03-16 21:45 UTC (permalink / raw) To: Alvaro Herrera <[email protected]>; +Cc: Antonin Houska <[email protected]>; Mihail Nikalayeu <[email protected]>; Pg Hackers <[email protected]>; Robert Treat <[email protected]> On Mon, 16 Mar 2026 at 20:34, Alvaro Herrera <[email protected]> wrote: > On 2026-Mar-16, Matthias van de Meent wrote: > > I agree it's not user-friendly, but that's the point of limiting > > permissions. Users can't install c-functions without SUPERUSER, > > because it can cause cluster instability and crashes. Users can't > > create slots without REPLICATION, because they'll be able to > > negatively impact the whole cluster's performance, and possibly, > > stability, when taking up replication slots that otherwise would be > > used for critical HA purposes. > > Well, as I said, these repack slots would be separate from the regular > ones for replication and available only for repack, so they would not > interfere with the count of slots used for replication, so the second > point is moot, isn't it? Sorry, I misread your response as "if at all, then at least like this", instead of "let's do this alternative in this patch". But, yes, a pool of slots used exclusively by REPACK CONCURRENTLY would also solve the slot availability issue. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Adding REPACK [concurrently] 2026-03-16 14:21 Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 2026-03-16 19:24 ` Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> @ 2026-03-16 21:50 ` Matthias van de Meent <[email protected]> 1 sibling, 0 replies; 6+ messages in thread From: Matthias van de Meent @ 2026-03-16 21:50 UTC (permalink / raw) To: Antonin Houska <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Mihail Nikalayeu <[email protected]>; Pg Hackers <[email protected]>; Robert Treat <[email protected]> On Mon, 16 Mar 2026 at 21:15, Antonin Houska <[email protected]> wrote: > > Matthias van de Meent <[email protected]> wrote: > > > I agree it's not user-friendly, but that's the point of limiting > > permissions. Users can't install c-functions without SUPERUSER, > > because it can cause cluster instability and crashes. Users can't > > create slots without REPLICATION, because they'll be able to > > negatively impact the whole cluster's performance, and possibly, > > stability, when taking up replication slots that otherwise would be > > used for critical HA purposes. > > I thought these attributes exist primarily for security purposes. Well, yes, but operational security is also security, right? > If > non-SUPERUSER user could install C-functions, it'd be easy to install code > that leaks data. Yes, as long as they're able to find the right primitives in the available binaries. That's certainly possible, but a bit more work than just none. > REPLICATION is currently the only way to limit access to the > the publisher's data as there is no ACL for publications. > And regarding resources, the REPLICATION attribute alone does not pose a limit > on resource consumption unless you limit the total number of sessions of all > the REPLICATION users at the same time. True. It's not great. And we also don't really have a (good) distinction for logical/physical replication permissions either... > Anyway (fortunately?), the concurrent use of slots by REPACK is limited > because, during the initialization of logical decoding, the backend needs to > wait for all the transactions having XID assigned to finish, and these include > the already running REPACK commands. See SnapBuildWaitSnapshot() and callers > if you're interested in details. Huh, so would you be able to run more than one Repack Concurrently in the same database? ISTM that would not be possible, apart from possibly a mechanism comparable to the SAFE_IN_IC flag (to not wait on those backends). Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) ^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2026-03-16 21:50 UTC | newest] Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-03-16 14:21 Re: Adding REPACK [concurrently] Matthias van de Meent <[email protected]> 2026-03-16 15:49 ` Robert Treat <[email protected]> 2026-03-16 17:55 ` Mahendra Singh Thalor <[email protected]> 2026-03-16 19:24 ` Matthias van de Meent <[email protected]> 2026-03-16 21:45 ` Matthias van de Meent <[email protected]> 2026-03-16 21:50 ` Matthias van de Meent <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox