Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1aukEQ-0002Gw-LQ for pgsql-performance@arkaria.postgresql.org; Mon, 25 Apr 2016 17:23:34 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1aukEQ-0005wp-37 for pgsql-performance@arkaria.postgresql.org; Mon, 25 Apr 2016 17:23:34 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1aukEO-0005tn-Pi for pgsql-performance@postgresql.org; Mon, 25 Apr 2016 17:23:33 +0000 Received: from mail-oi0-x230.google.com ([2607:f8b0:4003:c06::230]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1aukEM-0004t2-4V for pgsql-performance@postgresql.org; Mon, 25 Apr 2016 17:23:31 +0000 Received: by mail-oi0-x230.google.com with SMTP id k142so183660288oib.1 for ; Mon, 25 Apr 2016 10:23:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to; bh=XTcO0HOcc02nhXH2e3gMMQKHpCMelNeJLqYj2e1rcII=; b=eNHJXGxYVeIkxGA6W5YLDMpmXoo/pKlo3r4u5bqtrGzPaIZD/pkaQcNfxMffgNdPB3 LgQdFBJ4dbfp5jIWTPA93iS5N4wdKisrL12T+M3Yh0VSVO4pCNRCp7UAMFi7RelEzRh0 b580n/TIJibkpMb1OnQFaXxb3duya7KaT9XMVcIIq116lPWbHKSVZqq5/Ijzt8gtF7/z alHgoEv55Kye8PFtkVa/GCBReuw6AO38y/Rh0ybW03ahMN7Y6/n1pomHoRLA1Hb/D3Rj PcIS5XdfyuPriovtwD85eDdAJUDqxCor5yuRgZrh5e8FQFtpYK4OuZu7xl+9PpyNdYv2 hh3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to; bh=XTcO0HOcc02nhXH2e3gMMQKHpCMelNeJLqYj2e1rcII=; b=burRm0Wg+3RwNiUzFdGkG9jtF6Mpj5B8SW8FGyiAEKpveUXXVh+JI8VaBsLXiRMKTq krho9KxFCpi+xjtMz5OhHNhoTfbMfZrjKyYY8cEs6SWAMpI6dHMYNtbQGxU/iYHwEPoi 9oxjm7HYYP39Xi2i2bpcPDnAgDrtDCmDIX/mZjKuH02k+tj0Wx4Qt28pS27yQaNpAFmc NMo3bAfqJNiI+sGb52SYawpG1Kzkte7xH8ogmOYtrDzP13kMUJpvK0biJzz8YBUsFm33 atWi3BIbf2JddxGpqUSdfFb4Ts3aoOeNeuJ5u1sRU3/1ekEHFqq7agiWkHBFeAwy14Xe R/4Q== X-Gm-Message-State: AOPr4FUT5vRCDLDrfjzfLnfTLKxucdE9dHT+LkeIdFvhb48m8nFj8HhBSbZZMwEYWGI4hlttLQJ8GRDiJwffdQ== MIME-Version: 1.0 X-Received: by 10.202.82.134 with SMTP id g128mr15007748oib.31.1461605008663; Mon, 25 Apr 2016 10:23:28 -0700 (PDT) Received: by 10.157.12.174 with HTTP; Mon, 25 Apr 2016 10:23:28 -0700 (PDT) In-Reply-To: References: <5717D07E.3070802@sigaev.ru> Date: Mon, 25 Apr 2016 13:23:28 -0400 Message-ID: Subject: Re: Performant queries on table with many boolean columns From: Adam Brusselback To: "pgsql-performance@postgresql.org" Content-Type: multipart/alternative; boundary=001a113b06a857ea9f0531526f11 X-Pg-Spam-Score: -2.7 (--) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a113b06a857ea9f0531526f11 Content-Type: text/plain; charset=UTF-8 At that point would it be better to just use a boolean array? Here is an example I just wrote up that does pretty damn fast searches. SET work_mem = '256 MB'; CREATE TABLE test_bool AS SELECT id, array_agg(random() < 0.85) as boolean_column FROM generate_series(1, 100) CROSS JOIN generate_series(1, 500000) id GROUP BY id; CREATE INDEX idx_test_bool ON test_bool (boolean_column); VACUUM ANALYZE test_bool; SELECT * FROM test_bool ORDER BY random() LIMIT 10 SELECT id FROM test_bool WHERE boolean_column = '{t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,f,t,t,t,t,f,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,f,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,f,f,t,t,t,t,t,t,t,t,t,f,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,f}' --001a113b06a857ea9f0531526f11 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
At that point would it be bette= r to just use a boolean array?

Here is an example I just wrote up that does prett= y damn fast searches.


SET work_mem =3D = 9;256 MB';

CREATE TABLE test_bool AS=C2=A0
SE= LECT id, array_agg(random() < 0.85) as boolean_column
FROM generate_series(1, 100)
= CROSS JOIN generate_series(1, 500000) id
GR= OUP BY id;

CREATE INDEX idx_test_bool ON test_bool (boolean_column);

VACUUM ANALYZE tes= t_bool;

SELECT *
FROM test_bool
ORDER BY =C2=A0random()
LI= MIT 10

SELECT id
FROM test_bool
WHERE boolean_column =3D '{t,t,t,t,t,t,t,t,t,t,t,t,t,t= ,t,t,t,t,t,f,t,t,t,t,f,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,f,t,t,t,t,t,= t,t,t,t,t,t,t,t,t,t,f,f,t,t,t,t,t,t,t,t,t,f,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t,t= ,t,t,t,t,t,t,t,t,t,t,f}'

--001a113b06a857ea9f0531526f11--