public inbox for [email protected]
help / color / mirror / Atom feedFrom: Steve Wampler <[email protected]>
To: Scott Lamb <[email protected]>
Cc: Postgres-performance <[email protected]>
Subject: Re: Help speeding up delete
Date: Mon, 14 Nov 2005 16:52:53 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
Scott Lamb wrote:
> On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote:
>
>> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
>> # WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';
>
>
> Isn't this equivalent?
>
> select id from tmp_table2 where name = 'obsid' and value = 'oid080505';
Probably, the user based the above on a query designed to find
all rows with the same id as those rows that have a.name='obsid' and
a.value='oid080505'. However, I think the above would work to locate
all the ids, which is all we need for the delete (see below)
>> # DELETE FROM "tmp_table2" WHERE id IN
>> # (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
>> # WHERE at.id=a.id and a.name='obsid' and a.value='oid080505');
>
>
> and this?
>
> delete from tmp_table2 where name = 'obsid' and value = 'oid080505';
>
> Why are you doing a self-join using id, which I assume is a primary key?
Because I think we need to. The above would only delete rows that have
name = 'obsid' and value = 'oid080505'. We need to delete all rows that
have the same ids as those rows. However, from what you note, I bet
we could do:
DELETE FROM "tmp_table2" WHERE id IN
(SELECT id FROM "temp_table2" WHERE name = 'obsid' and value= 'oid080505');
However, even that seems to have a much higher cost than I'd expect:
lab.devel.configdb=# explain delete from "tmp_table2" where id in
(select id from tmp_table2 where name='obsid' and value = 'oid080505');
NOTICE: QUERY PLAN:
Seq Scan on tmp_table2 (cost=0.00..65705177237.26 rows=769844 width=6)
SubPlan
-> Materialize (cost=42674.32..42674.32 rows=38 width=50)
-> Seq Scan on tmp_table2 (cost=0.00..42674.32 rows=38 width=50)
EXPLAIN
And, sure enough, is taking an extrordinarily long time to run (more than
10 minutes so far, compared to < 10seconds for the select). Is this
really typical of deletes? It appears (to me) to be the Seq Scan on tmp_table2
that is the killer here. If we put an index on, would it help? (The user
claims she tried that and it's EXPLAIN cost went even higher, but I haven't
checked that...)
Thanks!
--
Steve Wampler -- [email protected]
The gods that smiled on your birth are now laughing out loud.
view thread (24+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Help speeding up delete
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox