Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tPn0Z-00FJRM-8U for pgsql-general@arkaria.postgresql.org; Mon, 23 Dec 2024 18:18:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tPn0Y-00E5Kk-Cw for pgsql-general@arkaria.postgresql.org; Mon, 23 Dec 2024 18:18:22 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tPn0X-00E5KV-VA for pgsql-general@lists.postgresql.org; Mon, 23 Dec 2024 18:18:21 +0000 Received: from mail-oi1-x233.google.com ([2607:f8b0:4864:20::233]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tPn0Q-001GDq-Jv for pgsql-general@postgresql.org; Mon, 23 Dec 2024 18:18:20 +0000 Received: by mail-oi1-x233.google.com with SMTP id 5614622812f47-3eba559611aso1215503b6e.0 for ; Mon, 23 Dec 2024 10:18:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734977893; x=1735582693; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=+Qd3PgpUUuQ9PcTb/mKVASfYjIFfEuo5DaUWjacvENU=; b=bLRpv5AHiEEd1aiYAlKq6L2OutRnFkxLo73GMTHcJkbWgqPZPhPxLFAWDddQ1Dh3oI xMH977Xams2qbtb5m/eflNCVnAyOx4rU0beTcNhRE/QKmybPlI1L+5eOPrArSayelCJ+ NT6gJl1z2UOYrBU77Nf6/QZ8Vm3IGz1xmUULWYIUpSTYe8TUJFplFpxyIn7qZ6yUz12j ZPfhDgPs0Q94s3mxRVGKXzccXVy9GbhP/4oHHvpM9FbRCRYi6J8GhZzBWVTUDtErY8p6 b3ltjp0F4Urfww86evmoKHJZnf9Wo6wT5fs+P6/PURVXTsqOsqew2BOV51jr7Eb8RivW s3Dg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734977893; x=1735582693; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=+Qd3PgpUUuQ9PcTb/mKVASfYjIFfEuo5DaUWjacvENU=; b=rfZa3+hO+quORDHQxtGrCxbBHvK6UDsfVT2qf/kiBwkJ9FThpiif/ZC+zSRMmZ55j6 eLUWxyDDf16D51LeyLYFpBrjPaAJ2Q5RkhV/PTrlWS3mN7uPwJZcbKZPlp2GAQ/euRS1 yIJYwiJdXp0i9BUxo4Yi4Q27+47t4QBuAtUhhzksg5vjJz0p7rxItLP+vYd+1/MiVmFn WxCPlfbk84jji/HRdhMKbEBOZ9XilDCPDOJfnCl5K34AGAaxAlPV25oCa8xBfabu67ic eMtoA9BgIR0gJwmp+gXEnKunLoX3XRIVEgaXTN9dYvvunHxZvV36X5ySPRGZaJIOAfCe vcPw== X-Gm-Message-State: AOJu0YzwVcexiPf70s8cGQJJYCCoAv35bHtii5KiOLUResYL99aTVyPZ TsIOGYf5lRMnS/pCZdO958LroWctz5GjCDpPBcpPaBBqyUWHCCi1W+zO/QlZdvwqIwxKAwdABcf CvWWk2HC0hhl3UaeNtSKTlWXZ43370lCYcTxw8g== X-Gm-Gg: ASbGncuqNF+ruhe9G8QpUizTgHAcQD1fG5J4Ve8Z9ZHal/ffGGMBpGORS7XUzcH5Uk5 VZO1Ip1lHCJcshWj4eI0CtkPF60yIFkgpjMQPxpTpL7UyumbuKyFIAfjh13AEWkxhprDVy6hr X-Google-Smtp-Source: AGHT+IH5ef9BVMk7OyJ/4w1UFb7fGGiT1Hc0hi3utAP/EUxzCYChUkj6TZYqfPrhEyQR/ULlu6FXxbCh6KFLgsMjfXc= X-Received: by 2002:a05:6870:b14a:b0:29e:6096:c271 with SMTP id 586e51a60fabf-2a7fb00b0b6mr7335189fac.9.1734977892833; Mon, 23 Dec 2024 10:18:12 -0800 (PST) MIME-Version: 1.0 References: <06e1f1ee-74b2-43a2-9a63-da20ae455ae2@aklaver.com> In-Reply-To: From: Divyansh Gupta JNsThMAudy Date: Mon, 23 Dec 2024 23:48:01 +0530 Message-ID: Subject: Re: Need help in database design To: Ron Johnson Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000fee3dc0629f40434" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000fee3dc0629f40434 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Adrian, the partition is on userid using hash partition with 84 partitions Ron, there could be more than 20 Million records possible for a single userid in that case if I create index on userid only not on other column the query is taking more than 30 seconds to return the results. On Mon, 23 Dec 2024, 11:40=E2=80=AFpm Ron Johnson, wrote: > If your queries all reference userid, then you only need indices on gdid > and userid. > > On Mon, Dec 23, 2024 at 12:49=E2=80=AFPM Divyansh Gupta JNsThMAudy < > ag1567827@gmail.com> wrote: > >> I have one confusion with this design if I opt to create 50 columns I >> need to create 50 index which will work with userid index in Bitmap on t= he >> other hand if I create a JSONB column I need to create a single index ? >> >> On Mon, 23 Dec 2024, 11:10=E2=80=AFpm Ron Johnson, >> wrote: >> >>> Given what you just wrote, I'd stick with 50 separate t* columns. >>> Simplifies queries, simplifies updates, and eliminates JSONB conversion= s. >>> >>> On Mon, Dec 23, 2024 at 12:29=E2=80=AFPM Divyansh Gupta JNsThMAudy < >>> ag1567827@gmail.com> wrote: >>> >>>> Values can be updated based on customer actions >>>> >>>> All rows won't have all 50 key value pairs always if I make those keys >>>> into columns the rows might have null value on the other hand if it is >>>> JSONB then the key value pair will not be there >>>> >>>> Yes in UI customers can search for the key value pairs >>>> >>>> During data population the key value pair will be empty array in case >>>> of JSONB column or NULL in case of table columns, later when customer >>>> performs some actions that time the key value pairs will populate and >>>> update, based on what action customer performs. >>>> >>>> On Mon, 23 Dec 2024, 10:51=E2=80=AFpm Divyansh Gupta JNsThMAudy, < >>>> ag1567827@gmail.com> wrote: >>>> >>>>> Let's make it more understandable, here is the table schema with 50 >>>>> columns in it >>>>> >>>>> CREATE TABLE dbo.googledocs_tbl ( >>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1 >>>>> MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, >>>>> userid int8 NOT NULL, >>>>> t1 int4 NULL, >>>>> t2 int4 NULL, >>>>> t3 int4 NULL, >>>>> t4 int4 NULL, >>>>> t5 int4 NULL, >>>>> t6 int4 NULL, >>>>> t7 int4 NULL, >>>>> t8 int4 NULL, >>>>> t9 int4 NULL, >>>>> t10 int4 NULL, >>>>> t11 int4 NULL, >>>>> t12 int4 NULL, >>>>> t13 int4 NULL, >>>>> t14 int4 NULL, >>>>> t15 int4 NULL, >>>>> t16 int4 NULL, >>>>> t17 int4 NULL, >>>>> t18 int4 NULL, >>>>> t19 int4 NULL, >>>>> t20 int4 NULL, >>>>> t21 int4 NULL, >>>>> t22 int4 NULL, >>>>> t23 int4 NULL, >>>>> t24 int4 NULL, >>>>> t25 int4 NULL, >>>>> t26 int4 NULL, >>>>> t27 int4 NULL, >>>>> t28 int4 NULL, >>>>> t29 int4 NULL, >>>>> t30 int4 NULL, >>>>> t31 int4 NULL, >>>>> t32 int4 NULL, >>>>> t33 int4 NULL, >>>>> t34 int4 NULL, >>>>> t35 int4 NULL, >>>>> t36 int4 NULL, >>>>> t37 int4 NULL, >>>>> t38 int4 NULL, >>>>> t39 int4 NULL, >>>>> t40 int4 NULL, >>>>> t41 int4 NULL, >>>>> t42 int4 NULL, >>>>> t43 int4 NULL, >>>>> t44 int4 NULL, >>>>> t45 int4 NULL, >>>>> t46 int4 NULL, >>>>> t47 int4 NULL, >>>>> t48 int4 NULL, >>>>> t49 int4 NULL, >>>>> t50 int4 NULL, >>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid), >>>>> ); >>>>> >>>>> Every time when i query I will query it along with userid >>>>> Ex : where userid =3D 12345678 and t1 in (1,2,3) and t2 in (0,1,2) >>>>> more key filters if customer applies >>>>> >>>>> On the other hand if I create a single jsonb column the schema will >>>>> look like : >>>>> >>>>> CREATE TABLE dbo.googledocs_tbl ( >>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1 >>>>> MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, >>>>> userid int8 NOT NULL, >>>>> addons_json jsonb default '{}'::jsonb >>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid), >>>>> ); >>>>> >>>>> and the query would be like >>>>> where userid =3D 12345678 and ((addons_json @> {t1:1}) or (addons_js= on @> >>>>> {t1:2}) or (addons_json @> {t1:3}) >>>>> more key filters if customer applies >>>>> >>>>> >>>>> >>>>> On Mon, Dec 23, 2024 at 10:38=E2=80=AFPM David G. Johnston < >>>>> david.g.johnston@gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy < >>>>>> ag1567827@gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> So here my question is considering one JSONB column is perfect or >>>>>>> considering 50 columns will be more optimised. >>>>>>> >>>>>> The relational database engine is designed around the column-based >>>>>> approach. Especially if the columns are generally unchanging, combi= ned >>>>>> with using fixed-width data types. >>>>>> >>>>>> David J. >>>>>> >>>>>> >>> >>> -- >>> Death to , and butter sauce. >>> Don't boil me, I'm still alive. >>> lobster! >>> >> > > -- > Death to , and butter sauce. > Don't boil me, I'm still alive. > lobster! > --000000000000fee3dc0629f40434 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Adrian, the partition is on userid using hash partition with= 84 partitions

Ron, there could be more than 20 Million records possible fo= r a single userid in that case if I create index on userid only not on othe= r column the query is taking more than 30 seconds to return the results.

On Mon, 23 Dec 2024, 11:40=E2=80=AFpm Ron Johnson, <ronljohnsonjr@gmail.com> wrot= e:
If your qu= eries all reference userid, then you only need indices on gdid and userid.<= /div>
O= n Mon, Dec 23, 2024 at 12:49=E2=80=AFPM Divyansh Gupta JNsThMAudy <ag15= 67827@gmail.com> wrote:

I have one confusion with this design if I op= t to create 50 columns I need to create 50 index which will work with useri= d index in Bitmap on the other hand if I create a JSONB column I need to cr= eate a single index ?


On Mon= , 23 Dec 2024, 11:10=E2=80=AFpm Ron Johnson, <ronljohnsonjr@gmail.c= om> wrote:
Given what you just wrote, I'd stick with 50 se= parate t* columns.=C2=A0 Simplifies queries, simplifies updates, and elimin= ates JSONB conversions.

On Mon, Dec 23, 2024 at 12:29=E2=80=AFPM Divyansh Gu= pta JNsThMAudy <ag1567827@gmail.com> wrote:
=

Values can= be updated based on customer actions

All rows won't have all 50 key value pairs always if I m= ake those keys into columns the rows might have null value on the other han= d if it is JSONB then the key value pair will not be there

Yes in UI customers can search for the key value pairs

During data population the key value pair will be empty arra= y in case of JSONB column or NULL in case of table columns, later when cust= omer performs some actions that time the key value pairs will populate and = update, based on what action customer performs.


On Mon= , 23 Dec 2024, 10:51=E2=80=AFpm Divyansh Gupta JNsThMAudy, <= ag1567827@gmail.com> wrote:
Let's make it more understandable, = here is the table schema with 50 columns in it=C2=A0

CREATE TABLE db= o.googledocs_tbl (
gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMEN= T BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NO= T NULL,
userid int8 NOT NULL,
t1 int4 NULL,
t2 int4 NULL,
t= 3 int4 NULL,
t4 int4 NULL,
t5 int4 NULL,
t6 int4 NULL,
t7 i= nt4 NULL,
t8 int4 NULL,
t9 int4 NULL,
t10 int4 NULL,
t11 in= t4 NULL,
t12 int4 NULL,
t13 int4 NULL,
t14 int4 NULL,
t15 i= nt4 NULL,
t16 int4 NULL,
t17 int4 NULL,
t18 int4 NULL,
t19 = int4 NULL,
t20 int4 NULL,
t21 int4 NULL,
t22 int4 NULL,
t23= int4 NULL,
t24 int4 NULL,
t25 int4 NULL,
t26 int4 NULL,
t2= 7 int4 NULL,
t28 int4 NULL,
t29 int4 NULL,
t30 int4 NULL,
t= 31 int4 NULL,
t32 int4 NULL,
t33 int4 NULL,
t34 int4 NULL,
= t35 int4 NULL,
t36 int4 NULL,
t37 int4 NULL,
t38 int4 NULL,
= t39 int4 NULL,
t40 int4 NULL,
t41 int4 NULL,
t42 int4 NULL, t43 int4 NULL,
t44 int4 NULL,
t45 int4 NULL,
t46 int4 NULL, t47 int4 NULL,
t48 int4 NULL,
t49 int4 NULL,
t50 int4 NULL,<= br> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
);

Every t= ime when i query I will query it along with userid=C2=A0
Ex : where user= id =3D 12345678 and t1 in (1,2,3) and t2 in (0,1,2)
more key filters if = customer applies=C2=A0

On the other hand if I create a single jsonb = column the schema will look like :

CREATE TABLE dbo.googledocs_tbl (=
gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1 = MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
userid= int8 NOT NULL,
addons_json jsonb default '{}'::jsonb
CONST= RAINT googledocs_tbl_pkey PRIMARY KEY (gdid),
);

and the query wo= uld be like=C2=A0
where userid =3D 12345678 and ((addons_json=C2=A0@>= {t1:1}) or=C2=A0 (addons_json=C2=A0@&g= t; {t1:2}) or=C2=A0 (addons_json=C2=A0@&g= t; {t1:3})
more key filters if customer applies=C2=A0



On = Mon, Dec 23, 2024 at 10:38=E2=80=AFPM David G. Johnston <david.g.johnston@gmail.com> wrote:


On Mo= n, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy <ag1567827@gmail.com> wrote:


So = here my question is considering one JSONB column is perfect or considering = 50 columns will be more optimised.

The relational database engine is designed around the column-based ap= proach.=C2=A0 Especially if the columns are generally unchanging, combined = with using fixed-width data types.

David J.



--
Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.
<Redacted> lobs= ter!


--
Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.
<Redacted> lobs= ter!
--000000000000fee3dc0629f40434--