Received: from localhost (unknown [200.46.204.184]) by postgresql.org (Postfix) with ESMTP id 14BE92E0063 for ; Wed, 20 Feb 2008 07:27:05 -0400 (AST) Received: from postgresql.org ([200.46.204.71]) by localhost (mx1.hub.org [200.46.204.184]) (amavisd-maia, port 10024) with ESMTP id 01793-04-2 for ; Wed, 20 Feb 2008 07:26:54 -0400 (AST) Received: from mail01.enterprisedb.com (webmail.enterprisedb.com [63.246.7.168]) by postgresql.org (Postfix) with ESMTP id 8075D2E0044 for ; Wed, 20 Feb 2008 07:26:59 -0400 (AST) thread-index: Achzs5UCLlT/ipcLQeaKdo65Xk12Aw== Received: from [192.168.1.33] ([81.109.253.218]) by mail01.enterprisedb.com over TLS secured channel with Microsoft SMTPSVC(6.0.3790.3959); Wed, 20 Feb 2008 06:27:31 -0500 Content-Class: urn:content-classes:message Importance: normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.4133 Message-ID: <47BC0E41.3040704@enterprisedb.com> Date: Wed, 20 Feb 2008 11:25:53 +0000 From: "Heikki Linnakangas" Organization: EnterpriseDB User-Agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080110) MIME-Version: 1.0 To: "Gregory Stark" Cc: "Juho Saarikko" , Subject: Re: BUG #3965: UNIQUE constraint fails on long column values References: <200802181130.m1IBUNdu060026@wwwmaster.postgresql.org><47BBEADA.4090308@enterprisedb.com> <877ih0x9kk.fsf@oxford.xeocode.com> In-Reply-To: <877ih0x9kk.fsf@oxford.xeocode.com> Content-Type: text/plain; format=flowed; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 20 Feb 2008 11:27:31.0593 (UTC) FILETIME=[94FBCF90:01C873B3] X-Virus-Scanned: Maia Mailguard 1.0.1 X-Archive-Number: 200802/181 X-Sequence-Number: 19748 Gregory Stark wrote: > "Heikki Linnakangas" writes: > >> As others have pointed out, CREATE UNIQUE INDEX i ON ((md5(column)) is a pretty >> good work-around. > > Unless you need cryptographic security I would not suggest using MD5. MD5 is > intentionally designed to take a substantial amount of CPU resources to > calculate. > > Postgres's internal hash method is exposed for most data types as hashtext() > hashfloat8(), hashint4(), etc. These functions were chosen for their > lightweight design. > > Cryptographic security is important only if you're concerned with people being > able to intentionally create collisions. In this scenario that's probably not > a top threat. Conceivably someone could create a denial-of-service attack > slowing down your server by causing your indexes to become unbalanced. But it > would be fairly challenging to engineer. Return type of hash* functions is just 32 bits. I wonder if that's wide enough to avoid accidental collisions? Depends on the application of course... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com