public inbox for [email protected]  
help / color / mirror / Atom feed
From: Tom Lane <[email protected]>
To: Bruce Momjian <[email protected]>
Cc: David E. Wheeler <[email protected]>
Cc: Greg Stark <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Peter Geoghegan <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)
Date: Thu, 08 May 2014 10:16:54 -0400
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <CA+TgmoZucoe-w5P_16A-gdfRBFCCu7PD17YO-KqrYPBhb5TxAA@mail.gmail.com>
	<[email protected]>
	<CAM-w4HNawF2sPDSUrZsng7zgGnste-Vy1+42AufyVq9UkVeCZw@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
List-Unsubscribe: <mailto:[email protected]?body=unsub%20pgsql-hackers>

Bruce Momjian <[email protected]> writes:
> On Tue, May  6, 2014 at 06:20:53PM -0400, Tom Lane wrote:
>> I wonder why there's not an option to store keys and values separately,
>> but as hashes not as the original strings, so that indexability of
>> everything could be guaranteed.  Or a variant of that might be to hash
>> only strings that are too large to fit in an index entry, and force
>> recheck only when searching for a string that needed hashing.
>> 
>> I wonder whether the most effective use of time at this point
>> wouldn't be to fix jsonb_ops to do that, rather than arguing about
>> what to rename it to.  If it didn't have the failure-for-long-strings
>> problem I doubt anybody would be unhappy about making it the default.

> Can we hash just the values, not the keys, in jsonb_ops, and hash the
> combo in jsonb_hash_ops.  That would give us key-only lookups without a
> recheck.

No, because there's nothing in JSON limiting the length of keys, any more
than values.

I think the idea of hashing only keys/values that are "too long" is a
reasonable compromise.  I've not finished coding it (because I keep
getting distracted by other problems in the code :-() but it does not
look to be very difficult.  I'm envisioning the cutoff as being something
like 128 bytes; in practice that would mean that few if any keys get
hashed, I think.

			regards, tom lane


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



view thread (65+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox