Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7d21-005Xdg-32 for pgsql-hackers@arkaria.postgresql.org; Tue, 31 Mar 2026 17:37:38 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w7d20-00BjKh-1M for pgsql-hackers@arkaria.postgresql.org; Tue, 31 Mar 2026 17:37:36 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7d20-00BjKZ-01 for pgsql-hackers@lists.postgresql.org; Tue, 31 Mar 2026 17:37:36 +0000 Received: from mail-ua1-x92b.google.com ([2607:f8b0:4864:20::92b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w7d1y-000000020JF-1mMG for pgsql-hackers@lists.postgresql.org; Tue, 31 Mar 2026 17:37:35 +0000 Received: by mail-ua1-x92b.google.com with SMTP id a1e0cc1a2514c-9539d9f1675so2407253241.2 for ; Tue, 31 Mar 2026 10:37:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774978654; cv=none; d=google.com; s=arc-20240605; b=A9D7yGch2Jy4f3/XwCpdvbVKzpu/6taIUd6yoVbh8M4MnJ8BDkuABd55o2qpf5IKz3 zMi5iTNlzLTFFyDYTnHkV1/LpqzgU5pnUTu9NmlmXK+jWrDIECLfgi6U3tt3LDSMCfng H24ZcNtwvKOUo2po1JYB/yvX4uaTkNPMAhHPA9hucs4gFbGkyik8MJL4axXCPic/PNsz MC8OuRkU4Sn+WVdwiYIQ2qF3SRe8ayy/KqUrWvDOs/cLaVVtK1Ie9n9C4ztx3drUj9iZ K3yyY4p58f76dAHf8oiLKuq7jJLH/GxDOA/U9XVlk0g7Qt5RSM5ZSe56PJLE+WUXiS3Z X9DA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=KlyZ4bhoTsnPO5SieFmqqFvkMc6fEx5Ll/BslHQp3IQ=; fh=u8zxqgd/UA0jq2EGou+CszJPnyU2s1yA1V5fJOtpgEc=; b=fes27IzUjNKNmtxidz092gItugWur9cGnYArh7hXO7T0uKDCMEMR8NaXsCnYHRauWB KqSafLqZCVEY0yWsYsvTa6Kwbf/OZWKYwY2pacogcejkUoZWFeBwp8JIbRUHY2aQkgsP P2xgn1BMq6VncTGuDHxw17pNvSsOlEzyL1QGe3UK5UxtSvmzZFEuNV08X9XFtHvGjG+j VXQpez+GL3cZqVjVqVcW6BztvWcnmw5cG97zm5zLG0VXBvqHy90K0fz5tv1irfzSziqO iEf5vlPwPTHYqF5PsYtetPjdpSaVJZ0lH5WDCiT500kq9NCtyK3JNWyEx0jD/icaFPC0 ZKmA==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774978654; x=1775583454; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=KlyZ4bhoTsnPO5SieFmqqFvkMc6fEx5Ll/BslHQp3IQ=; b=ddnH3KECuEzfvOoKL0DE5ixotp+r4CY5Q+yX1eZvY72GTPVkuCMeJUew8sbqt2v5rX o652tQaO+djRqJw66W/gwzoFFUmoBvn3X5nJYi0yJkyICh4GYPybg3MBOrjozuYmLyGD 165IOVFI67SV86V5B6R9dJe/eSkFFnZ535wXr3kICrpEkqWxZFYZP/lGhR9Y+sCjUiHu cnTWF/X/2lzB79/NGEaOwKN/Z9QSG/ltvBRRe0nBYzpkLpBGt9l+OTwe/QkJZlWK1FlJ GYMIDPlIT6jXPCTbp0NICYompfKzzmgycz9CKnf1Ew+Du58ogqNvnuCoIMQo796noV80 aBcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774978654; x=1775583454; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=KlyZ4bhoTsnPO5SieFmqqFvkMc6fEx5Ll/BslHQp3IQ=; b=m7jHOpEpfrAN3V9J7oSnaQTlL00uf3sfDMGG1gPQih9/XbOQh9RUXWahnvzA3PUqMc xcq3v0OmGBVYxy7yRCmChXekHuwbUnYI/OhXF+DPGFbi3XH4dMKX/IBQeUr3RzRCaGaY oWe8XCDEdWbND2+jefHhmvWb/g0HzAKlWAESkH0PUJa+a/YLeLSe1HvYc4MuZmJRalh4 Tz3rdEZRYzngXaTPxu8UAbIQDF5y/hH6hQ3sZIl5IEwPZI8F9r17lW4h31Ebu0NeLA+9 7BlVNaq2Pb7/9AirZfYah8OU6PHOFaTxAvRzuWqXEzoOaXWFvRucSAnSLmbor8hqgKnM /+Cg== X-Forwarded-Encrypted: i=1; AJvYcCWLESbaXgYyBcbSwm99XllSKxLyFg6epPdgcfhqE0lqoZElBm0MWySm63mVojNn7WJq+bAHGiC1Fw6FQPcT@lists.postgresql.org X-Gm-Message-State: AOJu0YysG365RNGqmPNfCm6I8BveVVwYzev3yl2tFS8VHL+GMH9o/5c/ BAgS2qu7DczTga6RFoYOz71EoNAYU7Bw5l1LKJqBn/jxd3q532H4X0jL+nTEB0WHs4xWWpxvvhE XBlEABOc3DbMZgRL4ukcCgUrwUMOkIbE= X-Gm-Gg: ATEYQzwZ8B3jf8N+WzYj5fc124cyEoq2PYghGUwtx/jKS7GqA96PYpv2gmnh9Qo75Gd OxyIA8nAB3PW8O+vYTgB+L3Cmv5KGiAL33viwRTcj6ki/yku3sp31JFa4J3hIq/5BR8sTlgvJdu cp6gG/eznpLJuwwjsWKndbBYe7pZklflDteGikdZgT7x65DC7qtl7J01Ldoq+/t9e2MYo53ef4y EMGZiAs6fLPKwu9adgpAQu7ui3nVOVKnC6V3iJNvGXjvZt6fGvVFm9Z/ROlMZJ87O3NCUF/oP49 ulQthD0f3Tp35Be5GuxW1iSIeOhWf/SPL3iXF/OU3RmCe5rJW2swpMf/5KWmRV39mZqi1HsGYHU k/yAf5tZzGctEh0L5emGwt4bmqeI= X-Received: by 2002:a05:6102:50a0:b0:605:5d09:8631 with SMTP id ada2fe7eead31-6056820d511mr45924137.29.1774978648902; Tue, 31 Mar 2026 10:37:28 -0700 (PDT) MIME-Version: 1.0 References: <202603252005.quy5h4oipoxd@alvherre.pgsql> In-Reply-To: <202603252005.quy5h4oipoxd@alvherre.pgsql> From: Srinath Reddy Sadipiralla Date: Tue, 31 Mar 2026 23:07:17 +0530 X-Gm-Features: AQROBzAZNFirPK97By7UCe3mZl76Z9yYAfSXHo0oxpmsDqZiixAosYndJnpSSCs Message-ID: Subject: Re: Adding REPACK [concurrently] To: Alvaro Herrera Cc: Mihail Nikalayeu , Antonin Houska , Matthias van de Meent , Pg Hackers , Robert Treat Content-Type: multipart/alternative; boundary="000000000000da1049064e556bf0" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000da1049064e556bf0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Alvaro, On Thu, Mar 26, 2026 at 1:42=E2=80=AFAM Alvaro Herrera wrote: > > As for lock upgrade, I wonder if the best way to handle this isn't to > hack the deadlock detector so that it causes any *other* process to die, > if they detect that they would block on REPACK. Arguably there's > nothing that you can do to a table while its undergoing REPACK > CONCURRENTLY; any alterations would have to wait until the repacking is > compelted. We can implement that idea simply enough, as shown in this > crude prototype. > After testing this, I observed that it solves the scenario where a query is waiting on REPACK. For example, if a DROP TABLE requests an AEL and queues behind REPACK's ShareUpdateExclusiveLock, the deadlock detector comes when REPACK tries to upgrade to AEL, killing the DROP to prevent the circular queue deadlock, But the case I originally mentioned [1] was the reverse: what happens if a transaction already holds a lock that conflicts with the upcoming AEL upgrade (e.g., an analytical SELECT or an idle-in-transaction holding an AccessShareLock), but isn't waiting on REPACK at all? In this case, there's no circular wait. The deadlock detector never fires. REPACK simply queues behind the SELECT, eventually hits its lock_timeout, aborts and cleans up.Initially, I thought this cleanup was expected behavior. But after seeing your solution to protect REPACK from losing its transient table work, I thought it's "not expected". If the goal is to prevent REPACK's work from being wasted, should we error out the backend that is making REPACK wait during the final swap phase? I am thinking of something conceptually similar to ResolveRecoveryConflictWithLock,actively cancelling the conflicting session to allow the AEL upgrade to proceed. Thoughts? test scenario: session 1: postgres=3D# repack (concurrently) stress_victim; had a breakpoint rebuild_relation_finish_concurrent-> LockRelationOid(old_table_oid, AccessExclusiveLock); just before getting the exclusive lock. with lock_timeout =3D 5s session 2: postgres=3D# BEGIN; SELECT * FROM stress_victim LIMIT 1; -- left it open BEGIN id | balance | payload -----+---------+--------------------------------- ------------------------------------------------- ------------------------------------------------- ------------------------------------------------- -------------- 170 | 65 | d12f400c4d0d3c49818f88597e16cf29 d12f400c4d0d3c49818f88597e16cf29d12f400c4d0d3c498 18f88597e16cf29d12f400c4d0d3c49818f88597e16cf29d1 2f400c4d0d3c49818f88597e16cf29d12f400c4d0d3c49818 f88597e16cf29 (1 row) -- this gets us a conflicting lock (AccessShareLock) on the same table, REPACK (concurrently) is running on. session 1: release the breakpoint and now the backend waits for the conflicting lock to be released. in between if lock_timeout occurs then transaction aborts. postgres=3D# repack (concurrently) stress_victim; ERROR: canceling statement due to lock timeout CONTEXT: waiting for AccessExclusiveLock on relation 16637 of database 5 [1] - https://www.postgresql.org/message-id/CAFC%2Bb6pK9ogeSpMA8hg18XhC1eNPcsKWBw= oC5OySXi4iTxwtRw%40mail.gmail.com --=20 Thanks, Srinath Reddy Sadipiralla EDB: https://www.enterprisedb.com/ --000000000000da1049064e556bf0 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Alvaro,

On Thu, M= ar 26, 2026 at 1:42=E2=80=AFAM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

As for lock upgrade, I wonder if the best way to handle this isn't to hack the deadlock detector so that it causes any *other* process to die, if they detect that they would block on REPACK.=C2=A0 Arguably there's<= br> nothing that you can do to a table while its undergoing REPACK
CONCURRENTLY; any alterations would have to wait until the repacking is
compelted.=C2=A0 We can implement that idea simply enough, as shown in this=
crude prototype.=C2=A0

After testing this, I obser= ved that it solves the scenario where a query is waiting
on REPACK. For = example, if a DROP TABLE requests an AEL and queues
behind REPACK's = ShareUpdateExclusiveLock, the deadlock detector comes
when REPACK tries = to upgrade to AEL, killing the DROP to prevent the circular
queue deadlo= ck, But=C2=A0the case I originally mentioned [1] was the reverse: what
h= appens if a transaction already holds a lock that conflicts with the upcomi= ng
AEL upgrade (e.g., an analytical SELECT or an idle-in-transaction hol= ding an AccessShareLock),
but isn't waiting on REPACK at all?
In this case, there's no circular wait. The deadlock detector never fi= res. REPACK
simply queues behind the SELECT, eventually hits its lock_ti= meout, aborts and
cleans up.Initially, I thought this cleanup was expect= ed behavior. But after seeing
your solution to protect REPACK from losin= g its transient table work, I thought it's "not expected".If the goal is to prevent REPACK's work from being wasted, should we e= rror out
the backend that is making REPACK wait during the final swap ph= ase? I am thinking
of something conceptually similar to ResolveRecoveryC= onflictWithLock,actively
cancelling the conflicting session to allow the= AEL upgrade to proceed. Thoughts?



test scenario:
=
session 1:
postgres=3D# repack (concurrently) stress_victim;
had = a breakpoint rebuild_relation_finish_concurrent->
LockRelationOid(old= _table_oid, AccessExclusiveLock); just before getting
the exclusive lock= .
with lock_timeout =3D 5s



session 2:
postgres=3D# BEG= IN;
SELECT * FROM stress_victim LIMIT 1;
-- left it open
BEGIN
= =C2=A0id =C2=A0| balance |
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0payl= oad
-----+---------+---------------------------------
---------------= ----------------------------------
-------------------------------------= ------------
-------------------------------------------------
------= --------
=C2=A0170 | =C2=A0 =C2=A0 =C2=A065 | d12f400c4d0d3c49818f88597e= 16cf29
d12f400c4d0d3c49818f88597e16cf29d12f400c4d0d3c498
18f88597e16c= f29d12f400c4d0d3c49818f88597e16cf29d1
2f400c4d0d3c49818f88597e16cf29d12f= 400c4d0d3c49818
f88597e16cf29
(1 row)
-- this gets us a conflictin= g lock (AccessShareLock) on the same table,
REPACK (concurrently) is run= ning on.



session 1:
release the breakpoint and now the ba= ckend waits for the conflicting lock
to be released.
in between if lo= ck_timeout occurs then transaction aborts.
postgres=3D# repack (concurre= ntly) stress_victim;
ERROR: =C2=A0canceling statement due to lock timeou= t
CONTEXT: =C2=A0waiting for AccessExclusiveLock on relation 16637 of da= tabase 5


--
T= hanks,
Srinath Reddy Sadipiralla
<= span style=3D"color:rgb(34,34,34)">EDB:=C2=A0https:/= /www.enterprisedb.com/
--000000000000da1049064e556bf0--