X-Original-To: pgsql-performance-postgresql.org@localhost.postgresql.org Received: from localhost (av.hub.org [200.46.204.144]) by svr1.postgresql.org (Postfix) with ESMTP id CD41DDB685 for ; Mon, 14 Nov 2005 19:52:54 -0400 (AST) Received: from svr1.postgresql.org ([200.46.204.71]) by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) with ESMTP id 29819-05 for ; Mon, 14 Nov 2005 23:52:51 +0000 (GMT) X-Greylist: from auto-whitelisted by SQLgrey- Received: from noao.edu (noao.edu [140.252.1.54]) by svr1.postgresql.org (Postfix) with ESMTP id 3D34EDAA96 for ; Mon, 14 Nov 2005 19:52:53 -0400 (AST) X-TFF-CGPSA-Version: 1.4f1 X-TFF-CGPSA-Filter: Scanned Received: from weaver.tuc.noao.edu ([140.252.14.8] verified) by noao.edu (CommuniGate Pro SMTP 5.0.1) with ESMTPS id 21636207; Mon, 14 Nov 2005 16:52:55 -0700 Received: from [127.0.0.1] (localhost.localdomain [127.0.0.1]) by weaver.tuc.noao.edu (8.13.1/8.13.1) with ESMTP id jAENqrpv000541; Mon, 14 Nov 2005 16:52:54 -0700 Message-ID: <43792355.1020501@noao.edu> Date: Mon, 14 Nov 2005 16:52:53 -0700 From: Steve Wampler User-Agent: Mozilla Thunderbird 1.0.7 (X11/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Scott Lamb CC: Postgres-performance Subject: Re: Help speeding up delete References: <43790A99.9050603@noao.edu> <5E90D296-FA4A-4E48-92B4-396AADE35EF2@slamb.org> In-Reply-To: <5E90D296-FA4A-4E48-92B4-396AADE35EF2@slamb.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-new at hub.org X-Spam-Status: No, score=0.005 required=5 tests=[AWL=0.005] X-Spam-Score: 0.005 X-Spam-Level: X-Archive-Number: 200511/231 X-Sequence-Number: 15488 Scott Lamb wrote: > On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote: > >> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a >> # WHERE at.id=a.id and a.name='obsid' and a.value='oid080505'; > > > Isn't this equivalent? > > select id from tmp_table2 where name = 'obsid' and value = 'oid080505'; Probably, the user based the above on a query designed to find all rows with the same id as those rows that have a.name='obsid' and a.value='oid080505'. However, I think the above would work to locate all the ids, which is all we need for the delete (see below) >> # DELETE FROM "tmp_table2" WHERE id IN >> # (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a >> # WHERE at.id=a.id and a.name='obsid' and a.value='oid080505'); > > > and this? > > delete from tmp_table2 where name = 'obsid' and value = 'oid080505'; > > Why are you doing a self-join using id, which I assume is a primary key? Because I think we need to. The above would only delete rows that have name = 'obsid' and value = 'oid080505'. We need to delete all rows that have the same ids as those rows. However, from what you note, I bet we could do: DELETE FROM "tmp_table2" WHERE id IN (SELECT id FROM "temp_table2" WHERE name = 'obsid' and value= 'oid080505'); However, even that seems to have a much higher cost than I'd expect: lab.devel.configdb=# explain delete from "tmp_table2" where id in (select id from tmp_table2 where name='obsid' and value = 'oid080505'); NOTICE: QUERY PLAN: Seq Scan on tmp_table2 (cost=0.00..65705177237.26 rows=769844 width=6) SubPlan -> Materialize (cost=42674.32..42674.32 rows=38 width=50) -> Seq Scan on tmp_table2 (cost=0.00..42674.32 rows=38 width=50) EXPLAIN And, sure enough, is taking an extrordinarily long time to run (more than 10 minutes so far, compared to < 10seconds for the select). Is this really typical of deletes? It appears (to me) to be the Seq Scan on tmp_table2 that is the killer here. If we put an index on, would it help? (The user claims she tried that and it's EXPLAIN cost went even higher, but I haven't checked that...) Thanks! -- Steve Wampler -- swampler@noao.edu The gods that smiled on your birth are now laughing out loud.