Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sBwlO-002Cw0-II for pgsql-general@arkaria.postgresql.org; Tue, 28 May 2024 13:21:16 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sBwlM-009e0a-UM for pgsql-general@arkaria.postgresql.org; Tue, 28 May 2024 13:21:12 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sBwlL-009e0R-GS for pgsql-general@lists.postgresql.org; Tue, 28 May 2024 13:21:12 +0000 Received: from fhigh7-smtp.messagingengine.com ([103.168.172.158]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sBwlE-002N3L-J3 for pgsql-general@postgresql.org; Tue, 28 May 2024 13:21:09 +0000 Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailfhigh.nyi.internal (Postfix) with ESMTP id A4BD511400DD; Tue, 28 May 2024 09:21:02 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute7.internal (MEProxy); Tue, 28 May 2024 09:21:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=purefiction.net; h=cc:cc:content-transfer-encoding:content-type:content-type :date:date:from:from:in-reply-to:in-reply-to:message-id :mime-version:references:reply-to:subject:subject:to:to; s=fm3; t=1716902462; x=1716988862; bh=bOB74DCkB22wQt0UR/b2B/Omh1yMAFav wA4aaSSccVU=; b=xkCi90cVcNLakvEF9v8b9BRZ7CQq21aT1PE5IKKO33IBc1Wg Zc38QLrJfQ+zsvpE7b09QShJ8juWBfZjWfWc3xccE4F3DHVELepmhLj5ie7/LunL AbfW1R6r7LoYyH3xMnSf7H2llMfzva3u19wU2Lh0nG8qVA42FDTUiBh0b+J6C54U fCMKhl2ywPiMjydkR17qUAZKXEOkTdjVg9Uwxg7qTFia5106gKqcDkgyt461xnNT OZhWy2Hxdy9Euf0ZmSM7f7x5zPF4Dh3DAccVJwVet2SsydrzxcPdScC+SiAvv8za jhSUD+i2I6WJZaB7rghm10p3tz+IgHHKe6D2eQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1716902462; x= 1716988862; bh=bOB74DCkB22wQt0UR/b2B/Omh1yMAFavwA4aaSSccVU=; b=F WXe5pMipwSAoMOPv4byGFBr8aFhJoP72FrTQWsXlAgEunvYSpyQNh58D3Xhm7fbS IOgdnTfAjv949bm6IAvYxuHkDNKoXdizCVbSoAENczSfA6+u10kaBPfQW7YvJqgd tXx38RfUGflg2v7fr4XrBLB8+hwD9G2xwEsc58Ck+Jw6KNCEFRRcR/HSrHY4gSe9 DcbwqYGWeWq8fX4uoHvqVm+UnjNHsEWKh2+CRs1m6JEjw/UrN/QS00FeGYxgJJqd WtcLF24nrIhwiGMkSEHlqo7tjqtpYarn1ygIc/a9G5hPi5MS7Qx+qMcnQa963P/a LZXGz7sFPKsxINKi7gGjg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrvdejkedgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurheptggguffhjgffvefgkfhfvffosehtqhhmtdhhtdejnecuhfhrohhmpeetlhgv gigrnhguvghrucfuthgruhgsohcuoegrlhgvgiesphhurhgvfhhitghtihhonhdrnhgvth eqnecuggftrfgrthhtvghrnheptdegfeekiedujeegleduudelhfefueejhfegheefgeef teegiedvheeghffgvddunecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrg hilhhfrhhomheprghlvgigsehpuhhrvghfihgtthhiohhnrdhnvght X-ME-Proxy: Feedback-ID: i341740b3:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 28 May 2024 09:21:01 -0400 (EDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.500.171.1.1\)) Subject: Re: Use of inefficient index in the presence of dead tuples From: Alexander Staubo In-Reply-To: <580159f0fad7be030ad8632e49d1cb01e8d38acc.camel@cybertec.at> Date: Tue, 28 May 2024 15:20:46 +0200 Cc: "pgsql-general@postgresql.org" Content-Transfer-Encoding: quoted-printable Message-Id: <2D6F40AA-1385-48AA-9F9F-DA8AA2BF30BC@purefiction.net> References: <580159f0fad7be030ad8632e49d1cb01e8d38acc.camel@cybertec.at> To: Laurenz Albe X-Mailer: Apple Mail (2.3774.500.171.1.1) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 28 May 2024, at 13:02, Laurenz Albe wrote: > ANALYZE considers only the live rows, so PostgreSQL knows that the = query will > return only few results. So it chooses the smaller index rather than = the one > that matches the WHERE condition perfectly. >=20 > Unfortunately, it has to wade through all the deleted rows, which is = slow. Sounds like the planner _should_ take the dead tuples into account. = I=E2=80=99m surprised there are no parameters to tweak to make the = planner understand that one index is more selective even though it is = technically larger. > But try to execute the query a second time, and it will be much = faster. > PostgreSQL marks the index entries as "dead" during the first = execution, so the > second execution won't have to look at the heap any more. Of course. It=E2=80=99s still not _free_; it=E2=80=99s still trawling = through many megabytes of dead data, and going through the shared buffer = cache and therefore competing with other queries that need shared = buffers.=20 > I understand your pain, but your use case is somewhat unusual. I don=E2=80=99t think rapidly updated tables is an unusual use of = Postgres, nor is the problem of long-running transaction preventing dead = tuple vacuuming. > What I would consider in your place is > a) running an explicit VACUUM after you delete lots of rows or The rows are deleted individually. It=E2=80=99s just that there are many = transactions doing it concurrently. > b) using partitioning to get rid of old data Partitioning will generate dead tuples in the original partition when = tuples are moved to the other partition, so I=E2=80=99m not sure how = that would help? I did explore a solution which is my =E2=80=9Cplan B=E2=80=9D =E2=80=94 = adding a =E2=80=9Cdone=E2=80=9D column, then using =E2=80=9CUPDATE =E2=80=A6= SET done =3D true=E2=80=9D rather than deleting the rows. This causes = dead tuples, of course, but then adding a new index with a =E2=80=9C=E2=80= =A6 WHERE NOT done=E2=80=9D filter fixes the problem by forcing the = query to use the right index. However, with this solution, rows will = still have to be deleted *sometime*, so this just delays the problem. = But it would allow a =E2=80=9Cbatch cleanup=E2=80=9D: =E2=80=9CDELETE = =E2=80=A6 WHERE done; VACUUM=E2=80=9D in one fell swoop.