Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sPTPj-004nUC-Lr for pgsql-general@arkaria.postgresql.org; Thu, 04 Jul 2024 20:50:47 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sPTPf-002t7K-Nz for pgsql-general@arkaria.postgresql.org; Thu, 04 Jul 2024 20:50:44 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sPTPf-002t6U-Al for pgsql-general@lists.postgresql.org; Thu, 04 Jul 2024 20:50:44 +0000 Received: from mail-oa1-x32.google.com ([2001:4860:4864:20::32]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sPTPZ-000RiK-5x for pgsql-general@lists.postgresql.org; Thu, 04 Jul 2024 20:50:43 +0000 Received: by mail-oa1-x32.google.com with SMTP id 586e51a60fabf-25e1610e359so507984fac.1 for ; Thu, 04 Jul 2024 13:50:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720126234; x=1720731034; darn=lists.postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=sBw3NX4mH3DQvOS3jSp5VeSZPDYPjYFuak79M/pMkjQ=; b=c5bGvbo29yW+dPDaGuneB2VTVLcXFJ+jhMfOo+FYXKlIbHFzPHYSWdu3ze9HkR9Cv+ knBPa2lPeAnHd7IB2XnPKQ6eD0lJGiARnfZbq5zvx6NtIVPE/RaZT/9uYBOcbOlNtXVf YB0z4n3bC8LRMe/OKHX9zBG7JeE2jOqbMi4CXIHZoPmtpYdrwO0DV1DhAHq8TRNQBxCr K1IaY6JTwTEfsKxwYF/13KLD3Et3EejgSvLZ8BVuJoupfgMqwBPWd6t4FeS6/sB5Hg6+ Zls2G+tQD/yFyF0Sw4OY9Gtg53OLIiUBqmIquleXcAPCsFKtrr8qpaZMYBNCJpghdeBJ 8tFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720126234; x=1720731034; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=sBw3NX4mH3DQvOS3jSp5VeSZPDYPjYFuak79M/pMkjQ=; b=mbl88Ra3t9OOhg3r2OiBKLhP2v/FGlfmhWHcSCbbc9Glqp+ZShn4Aq62ye78ZZcTSv iXIw6AIeI05jtlMPNYDeLg60z6ay4FkGR1Z34NbJ7t7p2tPTe9Q4DRnu6hrQPSpsoKHy 9X1lYcOyW0xDmNdKJbNhkL61wvw1bcu+9/RxQsd/gJyzvwmPSVPv/JJFFtFLzqHSPZm2 JJf3d41hrVLxmfTklulfXsqSZMncmPOs3+Vkt85jTrZgEQSR7gsvTebkfg5nEsaAJxxe PCvo0wa4D0Tg9VO9xdjVlGNnxiisfyklpO9c0LI/l7KHE7uj9yF+WKSpv+9luxLmk8wl fSRQ== X-Gm-Message-State: AOJu0YwBlvuMjmikZFufhbOKKV8vTMGbSgegpv7RcD1DBB8oUOGfaSaJ W24oo/xYn8P+9UN+FDlXsw1/qIVPnQ5Bm4HV4xZ8pT4WZuItFehmtdfOqfpiRBoVUPw9tdYQqD0 MKVEp3753CBD8BOOsHNsw6up5gF9SyzCO X-Google-Smtp-Source: AGHT+IH0TMY1S9oH+Zmdg1AYBju6Y+A8SO1VSavLjqPf4qSYqJ2V/MMRbAr5qBN5SM4g4Qlk0hp+3SSIhQQPNu+JUKA= X-Received: by 2002:a05:6870:8182:b0:25e:2208:6c8a with SMTP id 586e51a60fabf-25e2b8cf279mr2364028fac.4.1720126234046; Thu, 04 Jul 2024 13:50:34 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ron Johnson Date: Thu, 4 Jul 2024 16:50:23 -0400 Message-ID: Subject: Re: Design strategy for table with many attributes To: "pgsql-generallists.postgresql.org" Content-Type: multipart/alternative; boundary="00000000000026388a061c72191c" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000026388a061c72191c Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Jul 4, 2024 at 3:38=E2=80=AFPM Lok P wrote: > Hello, > In one of the applications we are getting transactions in messages/events > format and also in files and then they are getting parsed and stored into > the relational database. The number of attributes/columns each transactio= n > has is ~900+. Logically they are part of one single transaction > Nothing out of the ordinary. > and should be stored in one table as one single row. > Says who? > There will be ~500million such transactions each day coming into the > system. And there will be approx ~10K peak write TPS and 5K read TPS in > target state. This system has a postgres database as a "source of truth" = or > OLTP store. And then data moves to snowflakes for the olap store. > > Initially when the system was designed the number of attributes per > transaction was <100 but slowly the business wants to keep/persist other > attributes too in the current system and the number of columns keep growi= ng. > > However, as worked with some database systems , we get few suggestions > from DBA's to not have many columns in a single table. For example in > oracle they say not to go beyond ~255 columns as then row chaining and ro= w > migration type of things are going to hunt us. Also we are afraid > concurrent DMLS on the table may cause this as a contention point. So I > wanted to understand , in such a situation what would be the best design > approach we should use irrespective of databases? Or say, what is the > maximum number of columns per table we should restrict? Should we break t= he > single transaction into multiple tables like one main table and other > addenda tables with the same primary key to join and fetch the results > wherever necessary? > You need database normalization. It's a big topic. Here's a good simple explanation: https://popsql.com/blog/normalization-in-sql --00000000000026388a061c72191c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Thu, Jul 4, 2024 at 3:38=E2=80=AFPM Lo= k P <loknath.73@gmail.com>= ; wrote:
Hello,
In one of the applications w= e are getting transactions in messages/events format and also in files and = then they are getting parsed and stored into the relational database. The n= umber of attributes/columns each transaction has is ~900+. Logically they a= re part of one single transaction

No= thing out of the ordinary.
=C2=A0
and should be stored in one table = as one single row.

Says who?
=C2=A0
There will be ~500million such transactions each day coming into the= system. And there will be approx ~10K peak write TPS and 5K read TPS in ta= rget state. This system has a postgres database as a "source of truth&= quot; or OLTP store. And then data moves to snowflakes for the olap store.<= br>
Initially when the system was designed the number of attributes per = transaction was <100 but slowly the business wants to keep/persist other= attributes too in the current system and the number of columns keep growin= g.

However, as worked with some database systems , we get few sugges= tions from DBA's to not have many columns in a single table. For exampl= e in oracle they say not to go beyond ~255 columns as then row chaining and= row migration type of things are going to hunt us. Also we are afraid conc= urrent DMLS on the table may cause this as a contention point. So I wanted = to understand , in such a situation what would be the best design approach = we should use irrespective of databases? Or say, what is the maximum number= of columns per table we should restrict? Should we break the single transa= ction into multiple tables like one main table and other addenda tables wit= h the same primary key to join and fetch the results wherever necessary?

You need database normalization.=C2= =A0 It's a big topic.=C2=A0 Here's a good simple explanation:
=
--00000000000026388a061c72191c--