Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tPpal-00Ferc-Cb for pgsql-general@arkaria.postgresql.org; Mon, 23 Dec 2024 21:03:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tPpaj-00G35I-QD for pgsql-general@arkaria.postgresql.org; Mon, 23 Dec 2024 21:03:53 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tPpaj-00G35A-At for pgsql-general@lists.postgresql.org; Mon, 23 Dec 2024 21:03:53 +0000 Received: from mail-oi1-x232.google.com ([2607:f8b0:4864:20::232]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tPpag-001HZw-39 for pgsql-general@postgresql.org; Mon, 23 Dec 2024 21:03:52 +0000 Received: by mail-oi1-x232.google.com with SMTP id 5614622812f47-3ebb669df2eso2044364b6e.1 for ; Mon, 23 Dec 2024 13:03:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734987829; x=1735592629; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=qaxWrwhTTpUrZ7fhdVdp/AtwVixaZ927KAsV6sS3s2U=; b=BralT/J+wgO5zMB71AXbzeXmodCawDLYWX4KACUcKSmT4arFVBUCHKe1mv2PrqTFh/ b5Lu2rvj7IYJSuY2No65C+xw5Ysi8vTR3Q0YrZSYd74HNiL9e2dUkUae59JtFr8l1rgL x/UNtMNCQJv6dOlKs47S9T8mVv5UQJWjaClzvN0SHpTgfYVECgtg1luTYgSULxSoLVUM 652mFooNFM5Wc1dru4DxhIzWVEqOVLk5efvU6Re2TWENsGQP8U3aq0eJtZIAJOZ0NsgG a3+LGt+YRHh92XQJ03ukJoKCkQc2cZNs9EokTf81c2/kjEnMczF2SU/Vz7hjotKeJZyg Jk4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734987829; x=1735592629; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=qaxWrwhTTpUrZ7fhdVdp/AtwVixaZ927KAsV6sS3s2U=; b=gXABxDSL7Zu30dT2Dm1CW6d91Fu2hSqN4ifMNkYXcXLOGzcAsjy76eriJbAiduR0Ob gRz0kJLktnkKcYg5jERYgv0zO79vxA9FHaBM7263oSXzkBat9gc4b7CPBcBL0Xvi/sK4 49/et0URVEiVR+KlqE1GeXf5cKqM82iD7I9HUE6rhBL77Xbxdtg1JvPEmXdHEdD9FDTF ZiCdUKPpua8Q+TqCcFaei5wNG1ZoZuiVaiXsk1vdce/mEeSfDlAfIVFI7+LE7RD9gQgQ 14anGdbYYieBuEGE9pN76Wd0OTYpyr6KfKxQEdma/ytdBM7URSLhw6N91xtf0ToC5cTh IyXw== X-Forwarded-Encrypted: i=1; AJvYcCXWTcp6/rehuCI7z5enBa2ITdHKOAOf0mY1a3yRZ7qT/rhFMC3ffUcoJ1O5GuY9y+JQxQSrXAwT0ZgWw1Z3@postgresql.org X-Gm-Message-State: AOJu0YzOFI3u5/Kb972Z28FZCIgf1JhsWA/j97sdFl1pES+3Ul8jklAY yXIbBz4DND1dLe4KzL9A8a8StTQfHyCq59VG5lH1OnN/LSxBOhYdxJ7CqwtSCy4f07WnX7zsuJ3 +vVjSddURNZMyAO8wLD9FYwnXiN0= X-Gm-Gg: ASbGncsNKifNjlekAR1+i9kgWVy6Xta90hW9BgR9WoVB6NzAXO5KEaZ6TDtorcRZdmj 5iV0H4Sxmcd+axQIDKwkEwCo3U3dVXifJkGv11w== X-Google-Smtp-Source: AGHT+IEck4KU2d3qHD4Be0vnT3SyJXyA/gVhn3F1vyhdOAYmajZZ7Yy8Osrjbabq81gRN7LL/rM/jtHEV3/Che3w380= X-Received: by 2002:a05:6808:3a0c:b0:3eb:3dd6:81bc with SMTP id 5614622812f47-3ed89071a93mr8524141b6e.15.1734987828729; Mon, 23 Dec 2024 13:03:48 -0800 (PST) MIME-Version: 1.0 References: <06e1f1ee-74b2-43a2-9a63-da20ae455ae2@aklaver.com> In-Reply-To: From: "David G. Johnston" Date: Mon, 23 Dec 2024 14:03:12 -0700 Message-ID: Subject: Re: Need help in database design To: Divyansh Gupta JNsThMAudy Cc: Ron Johnson , pgsql-general Content-Type: multipart/alternative; boundary="000000000000389dae0629f655d2" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000389dae0629f655d2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Dec 23, 2024 at 11:26=E2=80=AFAM Divyansh Gupta JNsThMAudy < ag1567827@gmail.com> wrote: > > gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE 1 > MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, > > userid int8 NOT NULL, > > useremail varchar(600) NOT NULL, > > title public.citext NULL, > > authorname varchar(600) NULL, > > authoremail varchar(600) NULL, > > updated varchar(300) NOT NULL, > > entryid varchar(2000) NOT NULL, > > lastmodifiedby varchar(600) NULL, > > lastmodifiedbyemail varchar(600) NULL, > > "size" varchar(300) NULL, > > contenttype varchar(250) NULL, > > fileextension varchar(50) NULL, > > docfoldername public.citext NULL, > > folderresourceid public.citext NULL, > > filesize int8 DEFAULT 0 NOT NULL, > > retentionstatus int2 DEFAULT 0 NOT NULL, > > docfileref int8 NULL, > > usid int4 NULL, > > archivepath varchar(500) NULL, > > createddate timestamp(6) DEFAULT NULL::timestamp without time zone NULL, > > zipfilename varchar(100) NULL, > > oncreatedat timestamp(6) DEFAULT clock_timestamp() NOT NULL, > > onupdateat timestamp(6) DEFAULT clock_timestamp() NOT NULL, > > startsnapshot int4 DEFAULT 0 NOT NULL, > > currentsnapshot int4 DEFAULT 0 NOT NULL, > > dismiss int2 DEFAULT 0 NOT NULL, > > checksum varchar NULL, > > typeoffile int2 GENERATED ALWAYS AS ( > > CASE > > WHEN authoremail::text =3D useremail::text THEN 0::smallint > > ELSE 1::smallint > > END) STORED NOT NULL, > > parquetfilename varchar(100) NULL, > > metadata_vector tsvector GENERATED ALWAYS AS (to_tsvector('english':: > regconfig, (((((COALESCE(title::character varying, ''::text::character > varying)::text || ' '::text) || (COALESCE(docfoldername::text, ''::text) > || ' '::text)) || (COALESCE(authorname, ''::text::character varying)::tex= t > || ' '::text)) || (COALESCE(fileextension, ''::text::character varying):: > text || ' '::text)) || (COALESCE(lastmodifiedby, ''::text::character > varying)::text || ' '::text)) || COALESCE(contenttype, ''::character > varying::text::character varying)::text)) STORED NULL, > > isfileencrypted int4 DEFAULT 0 NULL, > > addons_json jsonb DEFAULT '{}'::jsonb NULL, > > CONSTRAINT googledocs_tbl_clone_pkey PRIMARY KEY (gdid, userid), > > CONSTRAINT fk_googledocs_tbl_clone_users_tbl FOREIGN KEY (userid) > REFERENCES dbo.users_tbl(uid) ON DELETE CASCADE > > ) > > PARTITION BY HASH (userid); > >> >>>> There are a lot of unconventional choices for data types and column naming there...and it doesn't seem as normalized as it could be. Besides that, consider that this probably should be at least two tables. Put columns that you expect to change whenever the user makes a typical edit on one table. Put those columns that can never change, or at least would rarely do so, on another. Narrower tables is usually a win and with this division you aren't throwing away and replacing all of the static data each time the volatile data changes. David J. --000000000000389dae0629f655d2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Mon, Dec 23, 2024 at 11:26=E2=80=AFAM Divyansh Gupta JN= sThMAudy <ag1567827@gmail.com= > wrote:

=

gdid int8<= /span> GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE <= span style=3D"color:rgb(192,192,192)">1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,

userid int8 NOT NULL,

useremail varchar(600) NOT NULL,

title public.citext NULL,

authorname = varchar(600) NULL,

authoremail varchar(600) NULL,

updated varchar(300) NOT NULL,

entryid varchar(2000) NOT NULL,

lastmodifiedby varchar(600= ) NULL,

<= p style=3D"margin:0px"> lastmodifiedby= email varcha= r(600) NULL,

"size" varchar(300) NULL,

contenttype varchar(250) NU= LL,

fileextension varchar(50) NULL,

docfoldername = public.citext NULL,

folderresourceid public.citext NULL,

filesize int8 DEFAULT 0 NOT NULL,

retentionstatus int2 DEFAULT 0 NOT NULL,

docfileref int8 NULL,

usid int4 NULL,

archivepath varchar(500) NULL,

createddate timestamp(6) DEFAULT NULL::timestamp without time zone NULL,

zipfilename varchar(100) NU= LL,

oncreatedat timestamp(6) DEFAULT clock_timestamp() NOT NULL,

onupdateat timestamp(6) DEFAULT clock_timestamp() NOT NULL,

startsnapshot int4 DEFAULT 0 <= span style=3D"color:rgb(115,158,202);font-weight:bold">NOT NULL,

currentsnapshot int4 DEFAULT 0 NOT NULL,

dismiss int2 DEFAULT 0 NOT NULL,

checksum varchar NULL,

= typeoffile int2 GENERATED ALWAYS AS (

CASE

WHEN authoremail::text =3D useremail::text TH= EN 0::smallint

ELSE 1::smallint

END) STORED NOT NULL,

parquetfilename varchar(100) NULL,

metadata_vector tsvector GENERATED ALWAYS AS (to_= tsvector('english'::regconfig, (((((COALESCE<= /span>(title::character varying, ''::text::character varying)::text || ' '<= /span>::text= ) || (COALESCE(docfoldername::text, ''::text) || ' '::text)) || (COALESCE(authorname,= ''::text::character varying)::text || ' '::text)) || (COALESCE(fileextensi= on, ''::text::character varying)::text || ' '::text)) || (COALESCE(lastmodifiedby, ''= ;::text::character varying)::= text || ' '::text)) || COALESCE(contenttype, &#= 39;'::ch= aracter vary= ing::text::character varying)= ::text)) STORED NULL,

isfileencrypted= int4 DEFAULT 0 NULL,

addons_json jsonb DEFAULT = 9;{}'::j= sonb NULL,

CONSTRAINT goo= gledocs_tbl_clone_pkey PRIMARY KEY (gdid, userid),

CONSTRAINT fk_googledocs_tbl_clone_users_tbl= FOREIGN KEY (userid) REFERENCES dbo.users_tbl(uid) ON DELETE CASCADE

)

PARTITION B= Y HASH (userid);


=

There are a lot of = unconventional choices for data types and column naming there...and it=C2= =A0doesn't seem as normalized as it could be.

Besi= des that, consider that this probably should be at least two tables.=C2=A0 = Put columns that you expect to change whenever the user makes a typical edi= t on one table.=C2=A0 Put those columns that can never change, or at least = would rarely do so, on another.=C2=A0 Narrower tables is usually a win and = with this division you aren't throwing away and replacing all of the st= atic data each time the volatile data changes.

David J= .

--000000000000389dae0629f655d2--