MIME-Version: 1.0
References: <CAHesJ5LES3aTDf=xp7NOwrADQ_HWC-Spsv7yLu9ZY+zxzZO53A@mail.gmail.com>
 <b450927c-49da-46e5-ad74-bf38ceff166b@aklaver.com> <CAHesJ5+ASNoSNMiC5Ms0Ts=gw7v2_UeBpUT=phujO4yE_XCbEw@mail.gmail.com>
 <CAHesJ5JbkCBZ2f_AvUr8+KWnGPAsobu4zyfnWm8bEeb7X9oqDQ@mail.gmail.com>
 <06e1f1ee-74b2-43a2-9a63-da20ae455ae2@aklaver.com> <CAHesJ5JLzhHiGSBSkJZ7x7rGgHeeByP=wWk1D5GG=x8cJ5YY6Q@mail.gmail.com>
 <CAKFQuwYdpzwcbSdQ8TvZ-nVjPeHVVz+5=bWofCbUK+p_o=axrQ@mail.gmail.com>
 <CAHesJ5+yTenkAxOT8H33Cfe=1b2kSyXGqxFYfYz5fgYAVVvFmw@mail.gmail.com>
 <CAHesJ5KaJ8p7QhB9UUoFEbA87cU7ke4GBMkKR3q2FJPVv9GXyw@mail.gmail.com>
 <CANzqJaB_s8eXCZJvYO9CLvgJNqrshD=G5GgECi1M9=vk-JHjdQ@mail.gmail.com>
 <CAHesJ5LgLi9-uGCk3J9TUkuyttysz3fzTaP+o57EjcBtwDYKZA@mail.gmail.com>
 <CANzqJaD-MwXzvg97q0iLvAdkf=DnUMOq0Ex2_eNU7sTxEL7bfA@mail.gmail.com>
 <CAHesJ5KtKm9fjhMdR1+cC-M5jW98Sz6sWKbt0mN6SJcfkq9eig@mail.gmail.com>
 <CAHesJ5Kne6MZakdhcQ9Zc-5KhBvhgUt+zUXuX5v6z+zwTY6gLQ@mail.gmail.com>
 <CANzqJaDb901c=fbicfuzXu1kvf1OR+6rtRZvtHOPyGaOD2E99Q@mail.gmail.com>
 <CAHesJ5LbFjqN0NndYUHKsXX_JssgwM-VPPaYoi28kZvsciFifQ@mail.gmail.com>
 <CAHesJ5Kn4bm6DmEVtEmHc1itXovdqNo8iE4kiDunH3gYK5jF4A@mail.gmail.com> <CANzqJaBhAissakbxvn6hUBenHBkHzjODGyw-Yg3z_EcsfYhKmg@mail.gmail.com>
In-Reply-To: <CANzqJaBhAissakbxvn6hUBenHBkHzjODGyw-Yg3z_EcsfYhKmg@mail.gmail.com>
From: Divyansh Gupta JNsThMAudy <ag1567827@gmail.com>
Date: Tue, 24 Dec 2024 00:41:26 +0530
Message-ID: <CAHesJ5++Ca0W9DSWA-XtfHohuD8a15GuaPk_DEPU3ggeq9WHWw@mail.gmail.com>
Subject: Re: Need help in database design
To: Ron Johnson <ronljohnsonjr@gmail.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Content-Type: multipart/alternative; boundary="0000000000001014370629f4c473"
Archived-At: <https://www.postgresql.org/message-id/CAHesJ5%2B%2BCa0W9DSWA-XtfHohuD8a15GuaPk_DEPU3ggeq9WHWw%40mail.gmail.com>
Precedence: bulk

--0000000000001014370629f4c473
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Thank you, everyone, for your valuable responses. I am glad that everyone
understands my concern. I got some good ideas about the database design
that I am following; after some stress testing, I will implement them.

Thank you so much, everyone.

On Tue, 24 Dec 2024, 12:09 am Ron Johnson, <ronljohnsonjr@gmail.com> wrote:

> Are these columns really unique for all 20M rows that a userid can have
> in the table?  I'm dubious.
>
> Split a LOT of those columns out into a separate table named "user" with
> PK userid.  It'll save a huge amount of disk space, and speed up queries
> by not having to fetch it all every time.
>
> useremail varchar(600) NOT NULL,
> title public.citext NULL,
> authorname varchar(600) NULL,
> authoremail varchar(600) NULL,
> updated varchar(300) NOT NULL,
> entryid varchar(2000) NOT NULL,
> lastmodifiedby varchar(600) NULL,
> lastmodifiedbyemail varchar(600) NULL,
> "size" varchar(300) NULL,
> contenttype varchar(250) NULL,
> fileextension varchar(50) NULL,
> docfoldername public.citext NULL,
> folderresourceid public.citext NULL,
> filesize int8 DEFAULT 0 NOT NULL,
> retentionstatus int2 DEFAULT 0 NOT NULL,
> docfileref int8 NULL,
> usid int4 NULL,
> archivepath varchar(500) NULL,
> createddate timestamp(6) DEFAULT NULL::timestamp without time zone NULL,
> zipfilename varchar(100) NULL,
> oncreatedat timestamp(6) DEFAULT clock_timestamp() NOT NULL,
> onupdateat timestamp(6) DEFAULT clock_timestamp() NOT NULL,
> startsnapshot int4 DEFAULT 0 NOT NULL,
> currentsnapshot int4 DEFAULT 0 NOT NULL,
> dismiss int2 DEFAULT 0 NOT NULL,
> checksum varchar NULL,
> typeoffile int2 GENERATED ALWAYS AS (
>
>
>
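
For illustration, a minimal sketch of what such a split might look like.
Which columns really depend only on userid is an assumption here
(useremail is used as the example), and the join query is hypothetical:

```sql
-- Sketch only: one row per user. That useremail depends solely on
-- userid is an assumption, not something confirmed in this thread.
CREATE TABLE dbo."user" (
    userid    int8 PRIMARY KEY,
    useremail varchar(600) NOT NULL
);

-- Per-user data is then fetched only when a query actually needs it.
SELECT d.gdid, u.useremail
FROM dbo.googledocs_tbl AS d
JOIN dbo."user" AS u USING (userid)
WHERE d.userid = 12345678;
```

The table name has to stay double-quoted because "user" is a reserved
word in PostgreSQL.
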
> On Mon, Dec 23, 2024 at 1:32 PM Divyansh Gupta JNsThMAudy
> <ag1567827@gmail.com> wrote:
>
>> Currently I haven't created those columns. I have created an addons_json
>> column, which is a JSONB column, but we are still discussing whether I
>> should create the columns or keep only the one JSONB column.
>>
>> On Tue, 24 Dec 2024, 12:00 am Divyansh Gupta JNsThMAudy,
>> <ag1567827@gmail.com> wrote:
>>
>>> Range partitioning can help when you apply a filter for a specific
>>> range, but in my case I always need to filter on userid. I do have date
>>> columns, but there is little variation in their timestamps, which is
>>> why I didn't go for range partitioning.
>>>
>>> On Mon, 23 Dec 2024, 11:57 pm Ron Johnson, <ronljohnsonjr@gmail.com>
>>> wrote:
>>>
>>>>
>>>> 1. I bet you'd get better performance using RANGE partitioning.
>>>> 2. Twenty million rows per userid is a *LOT*.  No subdivisions (like
>>>> date range)?
>>>>
>>>> On Mon, Dec 23, 2024 at 1:23 PM Divyansh Gupta JNsThMAudy
>>>> <ag1567827@gmail.com> wrote:
>>>>
>>>>> Adrian, Please check this out;
>>>>>
>>>>> PARTITION BY HASH (userid);
>>>>> CREATE TABLE dbo.googledocs_tbl_clone_part_0
>>>>>   PARTITION OF dbo.googledocs_tbl_clone
>>>>>   FOR VALUES WITH (modulus 84, remainder 0);
>>>>> ...
>>>>> CREATE TABLE dbo.googledocs_tbl_clone_part_83
>>>>>   PARTITION OF dbo.googledocs_tbl_clone
>>>>>   FOR VALUES WITH (modulus 84, remainder 83);
>>>>>
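
For illustration, hash partitions like these could also be created in a
loop instead of one statement per partition. A minimal sketch, assuming
the parent table dbo.googledocs_tbl_clone already exists and was declared
with PARTITION BY HASH (userid):

```sql
-- Sketch only: assumes dbo.googledocs_tbl_clone already exists as a
-- hash-partitioned parent (PARTITION BY HASH (userid)).
DO $$
BEGIN
    -- r is the remainder; modulus 84 gives remainders 0..83.
    FOR r IN 0..83 LOOP
        EXECUTE format(
            'CREATE TABLE dbo.googledocs_tbl_clone_part_%s
               PARTITION OF dbo.googledocs_tbl_clone
               FOR VALUES WITH (modulus 84, remainder %s)',
            r, r);
    END LOOP;
END
$$;
```

One thing to keep in mind: on a partitioned table, a primary key or
unique constraint has to include the partition key, so a key on gdid
alone would need to become something like (userid, gdid).
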
>>>>> On Mon, Dec 23, 2024 at 11:48 PM Divyansh Gupta JNsThMAudy
>>>>> <ag1567827@gmail.com> wrote:
>>>>>
>>>>>> Adrian, the partitioning is on userid, using hash partitioning with
>>>>>> 84 partitions.
>>>>>>
>>>>>> Ron, a single userid can have more than 20 million records. In that
>>>>>> case, if I create an index on userid only and not on the other
>>>>>> columns, the query takes more than 30 seconds to return the results.
>>>>>>
>>>>>> On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <ronljohnsonjr@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> If your queries all reference userid, then you only need indices on
>>>>>>> gdid and userid.
>>>>>>>
>>>>>>> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy
>>>>>>> <ag1567827@gmail.com> wrote:
>>>>>>>
>>>>>>>> I have one point of confusion with this design: if I opt to create
>>>>>>>> 50 columns, I need to create 50 indexes, which will be combined
>>>>>>>> with the userid index in a bitmap scan. On the other hand, if I
>>>>>>>> create a JSONB column, do I need to create only a single index?
>>>>>>>>
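
To make the index question concrete, a minimal sketch of the two options.
The index names are hypothetical, and treating t1 as a frequently
filtered key is an assumption:

```sql
-- Column variant: composite B-tree indexes, created only for the keys
-- that are filtered on often (t1 is an assumed example).
CREATE INDEX googledocs_tbl_userid_t1_idx
    ON dbo.googledocs_tbl (userid, t1);

-- JSONB variant: a single GIN index supports @> containment on any key.
CREATE INDEX googledocs_tbl_addons_gin_idx
    ON dbo.googledocs_tbl USING gin (addons_json jsonb_path_ops);
```

The GIN index above does not contain userid, so the planner would still
have to combine it with a separate userid index (for example with a
bitmap AND) or filter the matching rows afterwards.
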
>>>>>>>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson,
>>>>>>>> <ronljohnsonjr@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Given what you just wrote, I'd stick with 50 separate t* columns.
>>>>>>>>> Simplifies queries, simplifies updates, and eliminates JSONB
>>>>>>>>> conversions.
>>>>>>>>>
>>>>>>>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy
>>>>>>>>> <ag1567827@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Values can be updated based on customer actions.
>>>>>>>>>>
>>>>>>>>>> Not every row will have all 50 key-value pairs. If I make those
>>>>>>>>>> keys into columns, the rows might have NULL values; on the other
>>>>>>>>>> hand, if it is JSONB, the key-value pair will simply not be there.
>>>>>>>>>>
>>>>>>>>>> Yes, in the UI customers can search for the key-value pairs.
>>>>>>>>>>
>>>>>>>>>> During data population, the key-value pairs will be an empty
>>>>>>>>>> array in the case of a JSONB column, or NULL in the case of table
>>>>>>>>>> columns. Later, when the customer performs some action, the
>>>>>>>>>> key-value pairs will be populated and updated, based on which
>>>>>>>>>> action the customer performs.
>>>>>>>>>>
>>>>>>>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy,
>>>>>>>>>> <ag1567827@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Let's make it more understandable. Here is the table schema with
>>>>>>>>>>> 50 columns in it:
>>>>>>>>>>>
>>>>>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY (INCREMENT BY 1
>>>>>>>>>>> MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE)
>>>>>>>>>>> NOT NULL,
>>>>>>>>>>> userid int8 NOT NULL,
>>>>>>>>>>> t1 int4 NULL,
>>>>>>>>>>> t2 int4 NULL,
>>>>>>>>>>> t3 int4 NULL,
>>>>>>>>>>> t4 int4 NULL,
>>>>>>>>>>> t5 int4 NULL,
>>>>>>>>>>> t6 int4 NULL,
>>>>>>>>>>> t7 int4 NULL,
>>>>>>>>>>> t8 int4 NULL,
>>>>>>>>>>> t9 int4 NULL,
>>>>>>>>>>> t10 int4 NULL,
>>>>>>>>>>> t11 int4 NULL,
>>>>>>>>>>> t12 int4 NULL,
>>>>>>>>>>> t13 int4 NULL,
>>>>>>>>>>> t14 int4 NULL,
>>>>>>>>>>> t15 int4 NULL,
>>>>>>>>>>> t16 int4 NULL,
>>>>>>>>>>> t17 int4 NULL,
>>>>>>>>>>> t18 int4 NULL,
>>>>>>>>>>> t19 int4 NULL,
>>>>>>>>>>> t20 int4 NULL,
>>>>>>>>>>> t21 int4 NULL,
>>>>>>>>>>> t22 int4 NULL,
>>>>>>>>>>> t23 int4 NULL,
>>>>>>>>>>> t24 int4 NULL,
>>>>>>>>>>> t25 int4 NULL,
>>>>>>>>>>> t26 int4 NULL,
>>>>>>>>>>> t27 int4 NULL,
>>>>>>>>>>> t28 int4 NULL,
>>>>>>>>>>> t29 int4 NULL,
>>>>>>>>>>> t30 int4 NULL,
>>>>>>>>>>> t31 int4 NULL,
>>>>>>>>>>> t32 int4 NULL,
>>>>>>>>>>> t33 int4 NULL,
>>>>>>>>>>> t34 int4 NULL,
>>>>>>>>>>> t35 int4 NULL,
>>>>>>>>>>> t36 int4 NULL,
>>>>>>>>>>> t37 int4 NULL,
>>>>>>>>>>> t38 int4 NULL,
>>>>>>>>>>> t39 int4 NULL,
>>>>>>>>>>> t40 int4 NULL,
>>>>>>>>>>> t41 int4 NULL,
>>>>>>>>>>> t42 int4 NULL,
>>>>>>>>>>> t43 int4 NULL,
>>>>>>>>>>> t44 int4 NULL,
>>>>>>>>>>> t45 int4 NULL,
>>>>>>>>>>> t46 int4 NULL,
>>>>>>>>>>> t47 int4 NULL,
>>>>>>>>>>> t48 int4 NULL,
>>>>>>>>>>> t49 int4 NULL,
>>>>>>>>>>> t50 int4 NULL,
>>>>>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid)
>>>>>>>>>>> );
>>>>>>>>>>>
>>>>>>>>>>> Every time I query, I will filter on userid, e.g.:
>>>>>>>>>>> where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2)
>>>>>>>>>>> plus more key filters if the customer applies them.
>>>>>>>>>>>
>>>>>>>>>>> On the other hand, if I create a single jsonb column, the schema
>>>>>>>>>>> will look like:
>>>>>>>>>>>
>>>>>>>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>>>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY (INCREMENT BY 1
>>>>>>>>>>> MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE)
>>>>>>>>>>> NOT NULL,
>>>>>>>>>>> userid int8 NOT NULL,
>>>>>>>>>>> addons_json jsonb default '{}'::jsonb,
>>>>>>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid)
>>>>>>>>>>> );
>>>>>>>>>>>
>>>>>>>>>>> and the query would be like:
>>>>>>>>>>> where userid = 12345678 and ((addons_json @> '{"t1":1}') or
>>>>>>>>>>> (addons_json @> '{"t1":2}') or (addons_json @> '{"t1":3}'))
>>>>>>>>>>> plus more key filters if the customer applies them.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
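
For comparison, a minimal sketch of both filters written out as complete
statements. Selecting only gdid is an assumption made for brevity; the
table, columns and literal values are the ones from the message above:

```sql
-- Column variant: plain integer comparisons on the t* columns.
SELECT gdid
FROM dbo.googledocs_tbl
WHERE userid = 12345678
  AND t1 IN (1, 2, 3)
  AND t2 IN (0, 1, 2);

-- JSONB variant: the right-hand side of @> must be valid JSON text,
-- i.e. a quoted literal with double-quoted keys.
SELECT gdid
FROM dbo.googledocs_tbl
WHERE userid = 12345678
  AND (addons_json @> '{"t1": 1}'
    OR addons_json @> '{"t1": 2}'
    OR addons_json @> '{"t1": 3}');
```

Prefixing either statement with EXPLAIN (ANALYZE, BUFFERS) shows which
indexes the planner actually uses, which may help when stress testing
the two designs.
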
>>>>>>>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston
>>>>>>>>>>> <david.g.johnston@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy <
>>>>>>>>>>>> ag1567827@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So my question is: is a single JSONB column the right choice,
>>>>>>>>>>>>> or will 50 columns be more optimised?
>>>>>>>>>>>>>
>>>>>>>>>>>> The relational database engine is designed around the
>>>>>>>>>>>> column-based approach.  Especially if the columns are generally
>>>>>>>>>>>> unchanging, combined with using fixed-width data types.
>>>>>>>>>>>>
>>>>>>>>>>>> David J.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Death to <Redacted>, and butter sauce.
>>>>>>>>> Don't boil me, I'm still alive.
>>>>>>>>> <Redacted> lobster!

--0000000000001014370629f4c473--