Re: Use of inefficient index in the presence of dead tuples

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Alexander Staubo <[email protected]>
To: Tom Lane <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Use of inefficient index in the presence of dead tuples
Date: Wed, 29 May 2024 14:36:24 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>

> On 29 May 2024, at 02:53, Tom Lane <[email protected]> wrote:
> 
> Alexander Staubo <[email protected]> writes:
>> (2) Set up schema. It's important to create the index before insertion, in order to provoke a
>> situation where the indexes have dead tuples:
>> ...
>> (4) Then ensure all tuples are dead except one:
> 
>>    DELETE FROM outbox_batches;
>>    INSERT INTO outbox_batches (receiver, id) VALUES ('dummy', 'test');
> 
>> (5) Analyze:
> 
>>    ANALYZE outbox_batches;
> 
> So the problem here is that the ANALYZE didn't see any of the dead rows
> and thus there is no way to know that they all match 'dummy'.  The cost
> estimation is based on the conclusion that there is exactly one row
> that will pass the index condition in each case, and thus the "right"
> index doesn't look any cheaper than the "wrong" one --- in fact, it
> looks a little worse because of the extra access to the visibility
> map that will be incurred by an index-only scan.
> 
> I'm unpersuaded by the idea that ANALYZE should count dead tuples.
> Since those are going to go away pretty soon, we would risk
> estimating on the basis of no-longer-relevant stats and thus
> creating problems worse than the one we solve.

Mind you, “pretty soon” could actually be “hours" if a pg_dump is running, or some other long-running transaction is holding back the xmin. Granted, long-running transactions should be avoided, but they happen, and the result is operationally surprising.

I have another use case where I used a transaction to do lock a resource to prevent concurrent access. I.e. the logic did “SELECT … FROM … WHERE id = $1 FOR UPDATE” and held that transaction open for hours while doing maintenance. This ended up causing the exact same index issue with dead tuples, with some queries taking 30 minutes where they previously took just a few milliseconds. In retrospect, this process should have used advisory locks to avoid holding back vacuums. But the point stands that a small amount dead tuple cruft can massively skew performance in surprising ways.

> What is interesting here is that had you done ANALYZE *before*
> the delete-and-insert, you'd have been fine.  So it seems like
> somewhat out-of-date stats would have benefited you.
> 
> It would be interesting to see a non-artificial example that took
> into account when the last auto-vacuum and auto-analyze really
> happened, so we could see if there's any less-fragile way of
> dealing with this situation.

Just to clarify, this is a real use case, though the repro is of course artificial since the real production case is inserting and deleting rows very quickly.

According to collected metrics, the average time since the last autoanalyze is around 20 seconds for this table, same for autovacuum. The times I have observed poor performance is in situations where the autovacuum was not able reclaim non-removable rows, i.e. it’s not the absence of autovacuum, but rather the inability to clear up dead tuples.

view thread (8+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Use of inefficient index in the presence of dead tuples
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox