Re: Need help in database design - Divyansh Gupta JNsThMAudy

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Divyansh Gupta JNsThMAudy <[email protected]>
To: Ron Johnson <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: Need help in database design
Date: Mon, 23 Dec 2024 23:48:01 +0530
Message-ID: <CAHesJ5KtKm9fjhMdR1+cC-M5jW98Sz6sWKbt0mN6SJcfkq9eig@mail.gmail.com> (raw)
In-Reply-To: <CANzqJaD-MwXzvg97q0iLvAdkf=DnUMOq0Ex2_eNU7sTxEL7bfA@mail.gmail.com>
References: <CAHesJ5LES3aTDf=xp7NOwrADQ_HWC-Spsv7yLu9ZY+zxzZO53A@mail.gmail.com>
	<[email protected]>
	<CAHesJ5+ASNoSNMiC5Ms0Ts=gw7v2_UeBpUT=phujO4yE_XCbEw@mail.gmail.com>
	<CAHesJ5JbkCBZ2f_AvUr8+KWnGPAsobu4zyfnWm8bEeb7X9oqDQ@mail.gmail.com>
	<[email protected]>
	<CAHesJ5JLzhHiGSBSkJZ7x7rGgHeeByP=wWk1D5GG=x8cJ5YY6Q@mail.gmail.com>
	<CAKFQuwYdpzwcbSdQ8TvZ-nVjPeHVVz+5=bWofCbUK+p_o=axrQ@mail.gmail.com>
	<CAHesJ5+yTenkAxOT8H33Cfe=1b2kSyXGqxFYfYz5fgYAVVvFmw@mail.gmail.com>
	<CAHesJ5KaJ8p7QhB9UUoFEbA87cU7ke4GBMkKR3q2FJPVv9GXyw@mail.gmail.com>
	<CANzqJaB_s8eXCZJvYO9CLvgJNqrshD=G5GgECi1M9=vk-JHjdQ@mail.gmail.com>
	<CAHesJ5LgLi9-uGCk3J9TUkuyttysz3fzTaP+o57EjcBtwDYKZA@mail.gmail.com>
	<CANzqJaD-MwXzvg97q0iLvAdkf=DnUMOq0Ex2_eNU7sTxEL7bfA@mail.gmail.com>

Adrian, the partition is on userid using hash partition with 84 partitions

Ron, there could be more than 20 Million records possible for a single
userid in that case if I create index on userid only not on other column
the query is taking more than 30 seconds to return the results.

On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <[email protected]> wrote:

> If your queries all reference userid, then you only need indices on gdid
> and userid.
>
> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy <
> [email protected]> wrote:
>
>> I have one confusion with this design if I opt to create 50 columns I
>> need to create 50 index which will work with userid index in Bitmap on the
>> other hand if I create a JSONB column I need to create a single index ?
>>
>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson, <[email protected]>
>> wrote:
>>
>>> Given what you just wrote, I'd stick with 50 separate t* columns.
>>> Simplifies queries, simplifies updates, and eliminates JSONB conversions.
>>>
>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy <
>>> [email protected]> wrote:
>>>
>>>> Values can be updated based on customer actions
>>>>
>>>> All rows won't have all 50 key value pairs always if I make those keys
>>>> into columns the rows might have null value on the other hand if it is
>>>> JSONB then the key value pair will not be there
>>>>
>>>> Yes in UI customers can search for the key value pairs
>>>>
>>>> During data population the key value pair will be empty array in case
>>>> of JSONB column or NULL in case of table columns, later when customer
>>>> performs some actions that time the key value pairs will populate and
>>>> update, based on what action customer performs.
>>>>
>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy, <
>>>> [email protected]> wrote:
>>>>
>>>>> Let's make it more understandable, here is the table schema with 50
>>>>> columns in it
>>>>>
>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1
>>>>> MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>> userid int8 NOT NULL,
>>>>> t1 int4 NULL,
>>>>> t2 int4 NULL,
>>>>> t3 int4 NULL,
>>>>> t4 int4 NULL,
>>>>> t5 int4 NULL,
>>>>> t6 int4 NULL,
>>>>> t7 int4 NULL,
>>>>> t8 int4 NULL,
>>>>> t9 int4 NULL,
>>>>> t10 int4 NULL,
>>>>> t11 int4 NULL,
>>>>> t12 int4 NULL,
>>>>> t13 int4 NULL,
>>>>> t14 int4 NULL,
>>>>> t15 int4 NULL,
>>>>> t16 int4 NULL,
>>>>> t17 int4 NULL,
>>>>> t18 int4 NULL,
>>>>> t19 int4 NULL,
>>>>> t20 int4 NULL,
>>>>> t21 int4 NULL,
>>>>> t22 int4 NULL,
>>>>> t23 int4 NULL,
>>>>> t24 int4 NULL,
>>>>> t25 int4 NULL,
>>>>> t26 int4 NULL,
>>>>> t27 int4 NULL,
>>>>> t28 int4 NULL,
>>>>> t29 int4 NULL,
>>>>> t30 int4 NULL,
>>>>> t31 int4 NULL,
>>>>> t32 int4 NULL,
>>>>> t33 int4 NULL,
>>>>> t34 int4 NULL,
>>>>> t35 int4 NULL,
>>>>> t36 int4 NULL,
>>>>> t37 int4 NULL,
>>>>> t38 int4 NULL,
>>>>> t39 int4 NULL,
>>>>> t40 int4 NULL,
>>>>> t41 int4 NULL,
>>>>> t42 int4 NULL,
>>>>> t43 int4 NULL,
>>>>> t44 int4 NULL,
>>>>> t45 int4 NULL,
>>>>> t46 int4 NULL,
>>>>> t47 int4 NULL,
>>>>> t48 int4 NULL,
>>>>> t49 int4 NULL,
>>>>> t50 int4 NULL,
>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>> );
>>>>>
>>>>> Every time when i query I will query it along with userid
>>>>> Ex : where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2)
>>>>> more key filters if customer applies
>>>>>
>>>>> On the other hand if I create a single jsonb column the schema will
>>>>> look like :
>>>>>
>>>>> CREATE TABLE dbo.googledocs_tbl (
>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1
>>>>> MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
>>>>> userid int8 NOT NULL,
>>>>> addons_json jsonb default '{}'::jsonb
>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
>>>>> );
>>>>>
>>>>> and the query would be like
>>>>> where userid = 12345678 and ((addons_json @> {t1:1}) or  (addons_json @>
>>>>> {t1:2}) or  (addons_json @> {t1:3})
>>>>> more key filters if customer applies
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston <
>>>>> [email protected]> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>>
>>>>>>> So here my question is considering one JSONB column is perfect or
>>>>>>> considering 50 columns will be more optimised.
>>>>>>>
>>>>>> The relational database engine is designed around the column-based
>>>>>> approach.  Especially if the columns are generally unchanging, combined
>>>>>> with using fixed-width data types.
>>>>>>
>>>>>> David J.
>>>>>>
>>>>>>
>>>
>>> --
>>> Death to <Redacted>, and butter sauce.
>>> Don't boil me, I'm still alive.
>>> <Redacted> lobster!
>>>
>>
>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
>

view thread (19+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Need help in database design
  In-Reply-To: <CAHesJ5KtKm9fjhMdR1+cC-M5jW98Sz6sWKbt0mN6SJcfkq9eig@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox