Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sPbcA-005Q2y-To for pgsql-general@arkaria.postgresql.org; Fri, 05 Jul 2024 05:36:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sPbc8-006Wm1-KF for pgsql-general@arkaria.postgresql.org; Fri, 05 Jul 2024 05:36:09 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sPbc8-006Wlt-8Q for pgsql-general@lists.postgresql.org; Fri, 05 Jul 2024 05:36:08 +0000 Received: from mail-oi1-x22a.google.com ([2607:f8b0:4864:20::22a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sPbc5-000VM8-Um for pgsql-general@lists.postgresql.org; Fri, 05 Jul 2024 05:36:08 +0000 Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-3d562882f4cso740016b6e.3 for ; Thu, 04 Jul 2024 22:36:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720157763; x=1720762563; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=DoTfA3twUg1oJ4sL2Gouh6CFMnGsWrrzgWuXmECbPoo=; b=Qs3FRUhEomi55VYuwZLtFfEE0aSndXETf/7z7Bq3Oi7TbTqdG3PnxUYngmp4J8WlcC IV4o9OixLqnepWlynua/6BUC2/5DfIM2YGsdKH/L+cwbt3YXsZTfMSf1jnBFvtmvQF4o 3FOA5OD1g/VC0h+bdLB1AhukPHpI2I7n6PuFUWM2LzXYmjkEMHllLhKZA2j2CI95YqRV ez8eajjpGRDx2H835oj2jK9Nv8SE9PZmQpmD/Yo6htD8W1p9iyupRru8r47J8VwOW6w0 5hM5WnGF0aFRGaw0cF/QXVPO596xZKF3p/eVTZ/FRzxqZk5TMEltpYd4uFsBl4ATrTnN 32bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720157763; x=1720762563; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DoTfA3twUg1oJ4sL2Gouh6CFMnGsWrrzgWuXmECbPoo=; b=rgaM+Hs6DAIGd+fVRREWvs4Rjf8gcLXXuWbyHrISkmyQri60sbkLN32IBILl1OGjHo nOcw/BJiVYQCY5VLOS2N5wSFM6u/m3GizmDSbjL+UHeWj3v9HoHNvTIRGUAwvsExteEi L3gPgsAcO/MMAnkDf1nqaSPwBHZk0aOrjAgo81vqrzKAyszVdXlg9O+YezlrrRgkCx8f e6jh7qCMjeG7i/tMlemG7KI9Pcpw5yhSi+0Wq0EOny3UtLkM/8lZPdZ7eOq0/Z/oSrRK a1tVbL/+gLutoJIA7OsXLfyN5FlCz12zeFahXJ/sl1uaERypwRjoMzfldAVAi4Fp0tmP sgOA== X-Gm-Message-State: AOJu0Yyyc6Vo+v2yemgdBNtznCzv3V2Zr52Cz+jm/bAjIIMXqqNng4M8 /KhQ+XNV88SSV8vVh3Hw5ZGk8qluCNOGiFHmZNjr4JJ+sOxI5f2oyEsbaMqwj7rpTjnzYvBqsRy iYNOjinY7Bud6OwS351dzgQ+kCPKGnQ== X-Google-Smtp-Source: AGHT+IHVfQAJc0P5vIwsyBymIuRPy3628XS5KqqeuMZVzk+XXAV1xUN1IA4bqcQfU65YvnNwP1O1Q5epPcMma4hukyQ= X-Received: by 2002:a05:6808:1383:b0:3d6:9c05:1aff with SMTP id 5614622812f47-3d914c43713mr4160335b6e.10.1720157763486; Thu, 04 Jul 2024 22:36:03 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:ac9:6850:0:b0:539:aa10:6c7 with HTTP; Thu, 4 Jul 2024 22:36:03 -0700 (PDT) In-Reply-To: References: From: "David G. Johnston" Date: Thu, 4 Jul 2024 22:36:03 -0700 Message-ID: Subject: Re: Design strategy for table with many attributes To: Lok P Cc: pgsql-general Content-Type: multipart/alternative; boundary="0000000000007351e3061c79701f" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000007351e3061c79701f Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thursday, July 4, 2024, Lok P wrote: > > But do you also suggest keeping those table pieces related to each other > through the same primary key ? > > Yes, everyone row produced from the input data =E2=80=9Crow=E2=80=9D should= get the same ID associated with it - either as an entire PK or a component of a multi-column PK/unique index. > > > Won't there be a problem when we load the data like say for example , in > normal scenario the data load will be to one table but when we break it t= o > multiple tables it will happen to all the individual pieces, won't that > cause additional burden to the data load? > Yes, doing this requires additional CPU time to perform the work. I=E2=80= =99d say IO is hopefully a wash. > > Also I understand the technical limitation of the max number of > columns per table is ~1600. But should you advise to restrict/stop us to > some low number long before reaching that limit , such that we will not > face any anomalies when we grow in future. > In a row-oriented system wider is worser. I start evaluation of table design with that in mind at the fourth column (including the surrogate key that is usually present, and the natural key - ignoring auditing columns.) > > And if we should maintain any specific order in the columns from start t= o > end column in the specific table? > There is material out there on micro-optimizing column ordering to match with alignment boundaries. I=E2=80=99d the benefit is meaningful but there= is a non-trivial cost to actually setup the testing to verify that what you=E2= =80=99ve figured out is working. Never actually done it myself. Though it actually seems like something someone could/may have written an algorithm for (though I do not recall ever seeing mention of one.) David J. --0000000000007351e3061c79701f Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thursday, July 4, 2024, Lok P <loknath.73@gmail.com> wrote:

Bu= t do you also suggest keeping those table pieces related to each other thro= ugh the same primary key ?


Yes, everyone row produced from the input data =E2=80= =9Crow=E2=80=9D should get the same ID associated with it - either as an en= tire PK or a component of a multi-column PK/unique index.=C2=A0
<= br>

=C2=A0Won't there be a problem when we loa= d the data like say for example , in normal scenario=C2=A0the data load wil= l be to one table but when we break it to multiple tables it will happen to= all the individual pieces, won't that cause additional burden to the d= ata load?

Yes, doing this= requires additional CPU time to perform the work.=C2=A0 I=E2=80=99d say IO= is hopefully a wash.
=C2=A0
=

Also I unde= rstand the technical limitation of the max number of columns=C2=A0per table= is ~1600. But should you advise to restrict/stop us to some low number lon= g before reaching that limit , such that we will not face any anomalies whe= n we grow in future.

In a= row-oriented system wider is worser.=C2=A0 I start evaluation of table des= ign with that in mind at the fourth column (including the surrogate key tha= t is usually present, and the natural key - ignoring auditing columns.)
=C2=A0

=C2=A0And if we should maintain any = specific order in the columns from start to end column in the specific tabl= e?

There is material out = there on micro-optimizing column ordering to match with alignment boundarie= s.=C2=A0 I=E2=80=99d the benefit is meaningful but there is a non-trivial c= ost to actually setup the testing to verify that what you=E2=80=99ve figure= d out is working.=C2=A0 Never actually done it myself.=C2=A0 Though it actu= ally seems like something someone could/may have written an algorithm for (= though I do not recall ever seeing mention of one.)

David J.

--0000000000007351e3061c79701f--