Message-ID: <580159f0fad7be030ad8632e49d1cb01e8d38acc.camel@cybertec.at>
Subject: Re: Use of inefficient index in the presence of dead tuples
From: Laurenz Albe
To: Alexander Staubo, "pgsql-general@postgresql.org"
Date: Tue, 28 May 2024 13:02:49 +0200

On Tue, 2024-05-28 at 10:00 +0200, Alexander Staubo wrote:
> I am encountering an odd problem where Postgres will use the wrong index, particularly if the table
> has some dead tuples. The database affected is running 12.6, but I can also reproduce with 16.3.
>
> To reproduce:
> [create a table with a larger index on "id" and "receiver" and a smaller one on
> "receiver" and "created_at", then delete all but one row and ANALYZE]
>
> (7) Try the following query:
>
>   EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS, TIMING, SETTINGS, SUMMARY)
>   SELECT id FROM outbox_batches
>   WHERE receiver = 'dummy'
>   AND id = 'test';
>
> Here's the query plan:
>
>   Index Scan using outbox_batches_on_receiver_and_created_at on public.outbox_batches  (cost=0.38..8.39 rows=1 width=5) (actual time=0.426..984.038 rows=1 loops=1)
>     Output: id
>     Index Cond: (outbox_batches.receiver = 'dummy'::text)
>     Filter: (outbox_batches.id = 'test'::text)
>     Buffers: shared hit=3948 read=60742 dirtied=60741 written=30209
>   Settings: work_mem = '32MB'
>   Query Identifier: -2232653838283363139
>   Planning:
>     Buffers: shared hit=18 read=3
>   Planning Time: 1.599 ms
>   Execution Time: 984.082 ms
>
> This query is reading 60K buffers even though it only needs to read a single row. Notice in particular the
> use of the index outbox_batches_on_receiver_and_created_at, even though outbox_batches_pkey would be
> a much better choice. We know this because if we drop the first index:
>
>   Index Only Scan using outbox_batches_pkey on public.outbox_batches  (cost=0.50..8.52 rows=1 width=5) (actual time=2.067..2.070 rows=1 loops=1)
>     Output: id
>     Index Cond: ((outbox_batches.receiver = 'dummy'::text) AND (outbox_batches.id = 'test'::text))
>     Heap Fetches: 1
>     Buffers: shared hit=1 read=4
>   Settings: work_mem = '32MB'
>   Query Identifier: -2232653838283363139
>   Planning:
>     Buffers: shared hit=5 dirtied=1
>   Planning Time: 0.354 ms
>   Execution Time: 2.115 ms
>
> This is also the index that's used in the normal case when there are no dead tuples at all.
>
> Interestingly, the cost of an index only scan on outbox_batches_pkey is 8.52, whereas the other is
> 8.39.
> Is this because it considers the number of index pages? I've tried adjusting the various cost
> and memory settings, but they have no effect.

ANALYZE considers only the live rows, so PostgreSQL knows that the query will
return only a few results.  So it chooses the smaller index rather than the one
that matches the WHERE condition perfectly.

Unfortunately, it has to wade through all the deleted rows, which is slow.

But try to execute the query a second time, and it will be much faster.
PostgreSQL marks the index entries as "dead" during the first execution, so the
second execution won't have to look at the heap any more.

See https://www.cybertec-postgresql.com/en/killed-index-tuples/

> In this test, we created 5M dead tuples. However, for me it also reproduces with just 1,000 rows.
> For such a small table, the performance degradation is minimal, but it increases as more and more
> tuples are deleted.
>
> In a production environment, we have rows being constantly deleted at a high rate, leaving a table
> that often has very few live tuples, and often 500K+ dead tuples before autovacuum can kick in. Here
> I am consistently seeing the wrong index used, leading to poor performance.
>
> The autovacuum settings are aggressive, but for whatever reason it is not keeping up. We also have
> long-running transactions that sometimes cause the xmin to hang back for a while, preventing
> vacuums from helping.
>
> All of that said, I would rather Postgres choose the right index than spend a lot of time optimizing
> vacuums.

I understand your pain, but your use case is somewhat unusual.

What I would consider in your place is
a) running an explicit VACUUM after you delete lots of rows or
b) using partitioning to get rid of old data

I don't know how the PostgreSQL optimizer could be improved to take dead rows
into account.

Yours,
Laurenz Albe
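
For illustration only, options a) and b) above might look roughly like the
sketch below.  It is not taken from the thread: the column types, the DELETE
predicate, the primary key layout and the "outbox_batches_part" table with its
daily partition are all assumptions made up for the example.

```sql
-- Option a): vacuum explicitly right after a bulk delete.
-- The DELETE predicate is a hypothetical example.
DELETE FROM outbox_batches
WHERE created_at < now() - interval '1 hour';

-- VACUUM cannot run inside a transaction block, so issue it on its own.
VACUUM (ANALYZE) outbox_batches;

-- Check how many dead tuples the statistics collector still reports.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'outbox_batches';

-- Option b): range-partition by created_at so that old data is dropped
-- as a whole partition instead of being deleted row by row.
-- Table name and column types are assumptions.
CREATE TABLE outbox_batches_part (
    id         text        NOT NULL,
    receiver   text        NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    -- a primary key on a partitioned table must include the partition key
    PRIMARY KEY (receiver, id, created_at)
) PARTITION BY RANGE (created_at);

CREATE TABLE outbox_batches_part_2024_05_28
    PARTITION OF outbox_batches_part
    FOR VALUES FROM ('2024-05-28') TO ('2024-05-29');

-- Dropping the partition removes its rows without leaving dead tuples.
DROP TABLE outbox_batches_part_2024_05_28;
```

Note that an explicit VACUUM can only remove dead tuples that no open
transaction can still see, so with long-running transactions holding back
the xmin, as described in the quoted mail, option a) may help less than
expected.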