public inbox for [email protected]  
From: David G. Johnston <[email protected]>
To: Volker Boehm <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: similarity and operator '%'
Date: Mon, 30 May 2016 14:20:33 -0400
Message-ID: <CAKFQuwaEUYS75qJVGQZJ7FGDZtM+kMrzTQzdPNLFhxVk+vQkDg@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
List-Unsubscribe:  <mailto:[email protected]?body=unsub%20pgsql-performance>

On Mon, May 30, 2016 at 1:53 PM, Volker Boehm <[email protected]> wrote:

>
> The reason for using the similarity function in place of the '%'-operator
> is that I want to use different similarity values in one query:
>
>     select name, street, zip, city
>     from addresses
>     where name % $1
>         and street % $2
>         and (zip % $3 or city % $4)
>         or similarity(name, $1) > 0.8
>
> which means: take all addresses where name, street, zip and city have
> little similarity _plus_ all addresses where the name matches very well.
>
>
> The only way I found was to create a temporary table from the first
> query, change the similarity value with set_limit(), and then select the
> second query UNION the temporary table.
>
> Is there a more elegant and straightforward way to achieve this result?
>

Not that I can envision.

You are forced into using an operator due to our index implementation.

You are thus forced into using a GUC to control the parameter that the
index scanning function uses to compute true/false.

A GUC can only take on a single value within a given query - well, not
quite true[1] but the exception doesn't seem like it will help here.

Thus you are consigned to using two queries.
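
A rough sketch of what the two-query approach could look like, assuming
PostgreSQL 9.6 or later, where the threshold is exposed as the GUC
pg_trgm.similarity_threshold (on older releases set_limit() plays the same
role). The literal search values, the temporary table name loose_matches,
and the 0.3 threshold are placeholders chosen for illustration; this is not
tested against your schema:

```sql
BEGIN;

-- First pass: loose match on all columns with a low threshold (assumed 0.3).
SET LOCAL pg_trgm.similarity_threshold = 0.3;

CREATE TEMP TABLE loose_matches ON COMMIT DROP AS
SELECT name, street, zip, city
FROM addresses
WHERE name % 'Example Name'
  AND street % 'Example Street'
  AND (zip % '12345' OR city % 'Example City');

-- Second pass: strict match on name only with a high threshold.
SET LOCAL pg_trgm.similarity_threshold = 0.8;

SELECT name, street, zip, city
FROM addresses
WHERE name % 'Example Name'
UNION  -- UNION also removes rows found by both passes
SELECT name, street, zip, city
FROM loose_matches;

COMMIT;
```

SET LOCAL keeps each threshold change scoped to the transaction, so the
session default is restored after COMMIT, and both passes still use the
operator form that the index can serve.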

* A functional index doesn't work since the second argument is
query-specific.

[1] When defining a function you can attach a "SET" clause to it; this is
commonly used for search_path but should work with any GUC.  If you could wrap
the operator comparison in a custom function you could use this capability.
The alternative would require a function that takes the threshold as a value,
and the extension only provides variations that use the GUC.

I don't think this will use the index even if it compiles (not tested):

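CREATE FUNCTION similarity_80(col text, val text)
RETURNS boolean
-- pg_trgm.similarity_threshold is the GUC name in PostgreSQL 9.6+; older
-- releases only expose the threshold through set_limit().
SET pg_trgm.similarity_threshold = 0.80
LANGUAGE sql
AS $$
    -- % compares trigram similarity against the threshold set above
    SELECT col % val;
$$;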

David J.

