From: Rob Imig <[email protected]>
To: [email protected]
Subject: Performant queries on table with many boolean columns
Date: Wed, 20 Apr 2016 14:41:54 -0400
Message-ID: <CANcrS5pR1P1Tj=e-RQQ=FF3WPAy_fyruS0YJer-+iJHxR1JAiA@mail.gmail.com>

Hey all,

New to the lists so please let me know if this isn't the right place for
this question.

I am trying to understand how to structure a table to allow for optimal
performance on retrieval. The data will not change frequently, so you can
basically think of it as static. I am only concerned with optimizing reads
from basic SELECT...WHERE queries.

The data:

   - ~20 million records
   - Each record has 1 id and ~100 boolean properties
   - Each boolean property has ~85% of the records as true


The retrieval will always be something like "SELECT id FROM <table> WHERE
<conditions>".

<conditions> will be an arbitrary subset of the ~100 boolean columns, and
the query should return the ids of the records that match all of the
conditions (true for each listed boolean column). Example:
WHERE prop1 AND prop18 AND prop24
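
To make the query shape concrete, here is a minimal sketch of how such a
statement could be assembled. The table name "records", the propN column
naming, and the helper build_query are assumptions for illustration only,
not part of the actual schema.

```python
import re

# Hypothetical names: the real table and column names are not given above.
TABLE = "records"
_VALID_COLUMN = re.compile(r"prop[0-9]+")


def build_query(columns):
    """Return a SELECT for the ids where every listed boolean column is true."""
    if not columns:
        raise ValueError("at least one condition is required")
    for name in columns:
        # Column names cannot be passed as bind parameters, so only accept
        # names that match the assumed propN pattern.
        if not _VALID_COLUMN.fullmatch(name):
            raise ValueError(f"unexpected column name: {name!r}")
    return f"SELECT id FROM {TABLE} WHERE " + " AND ".join(columns)


print(build_query(["prop1", "prop18", "prop24"]))
```

Every query is a plain conjunction of columns, and which columns appear
is not known ahead of time.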


The obvious approach seems to be a table with ~100 columns, one column
for each boolean property. But what type of indexing strategy would one
use on that table? A BTREE index doesn't seem to make sense. Is there a
better way to structure it?
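
For a rough sense of why BTREE looks like a poor fit, here is a
back-of-the-envelope estimate of how many rows a query would return. It
assumes the properties are independent of each other, which may well not
hold for the real data; expected_matches is just an illustrative name.

```python
# Figures taken from the data description above.
TOTAL_ROWS = 20_000_000   # ~20 million records
TRUE_FRACTION = 0.85      # each property is true for ~85% of records


def expected_matches(num_conditions):
    """Expected number of rows where all num_conditions columns are true.

    Assumption: the columns are statistically independent, so the
    per-column fractions simply multiply.
    """
    return TOTAL_ROWS * TRUE_FRACTION ** num_conditions


for k in (1, 3, 10, 30):
    print(f"{k:>2} conditions: ~{expected_matches(k):,.0f} rows")
```

Under that assumption, a single condition still matches about 17 million
rows, so an index on any one column rules out only ~15% of the table.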


Any and all advice/tips/questions appreciated!

Thanks,
Rob

