Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sPSHP-004h40-76 for pgsql-general@arkaria.postgresql.org; Thu, 04 Jul 2024 19:38:07 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sPSHL-002Of3-5K for pgsql-general@arkaria.postgresql.org; Thu, 04 Jul 2024 19:38:03 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sPSHK-002OeS-Q9 for pgsql-general@lists.postgresql.org; Thu, 04 Jul 2024 19:38:03 +0000 Received: from mail-yb1-xb29.google.com ([2607:f8b0:4864:20::b29]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sPSHJ-000RG3-G7 for pgsql-general@lists.postgresql.org; Thu, 04 Jul 2024 19:38:02 +0000 Received: by mail-yb1-xb29.google.com with SMTP id 3f1490d57ef6-e03a5e54fa6so180040276.3 for ; Thu, 04 Jul 2024 12:38:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720121879; x=1720726679; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=XL7r92PjSVywHp3SKJjhZ7Ba1rrIKFWEEjlzb5sclbE=; b=Uz+rmw5CHT57EzQ1n2Jzbc9WVAVS7MFnl7hHs1v9edBYICTghhDMhuX7Ex5/cm5Y0I Uxipzhtv5xDCGHPo1w+S7Ry/bDuCYsLxuZXoNPOF2K8h/E9lR156LgPJwS2VI4cAVBtz erGkbwMkZ62DTUAcujiQv1zz7DBJ8XdvOgpz5elpKD0Gsa93LwpPJwTETcEG40YuX2qO zFXcmun5INS/6+yCgDmfOQ3LNca2bB7sli3nhNNbiw0cHolZ4l2itgNBX8AbtHh3wpx0 lQ4SlzXq2KBEuJ+SIt8aZuNbyy6vqttno5/EAaMjgocRacOGRuWWtZ4kDypzmbJ5VciL MUdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720121879; x=1720726679; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=XL7r92PjSVywHp3SKJjhZ7Ba1rrIKFWEEjlzb5sclbE=; b=PHgB9bA4MMHoUHse3k6xq2R1ipGxpiwiGGQW+GNKLUbLmGrAGPGIkKL7/QAJ4mmDMM mv2DYfaNrZul7vLtdkjYGnC8LK0/n4TzHx+m22BR74EkNr0nHMvU58awYms9f49HQs0u Yq/qvUIRF7JE3yCLt2Brh/10aK7670S+5YtYlayHrVa1Fov46DUPeEe89mfVvJu9KzFj wX8jOV2vuYx+CqCTXQUtge5Dajyy9HQpfMtJSjrPCZv57tTgOXK4zoey/ePpXJqhf2MQ /bl81aoHcIn44AWg+eHsh6cdB+Km4Vpc4Nw1UDh89aIdEFLQpVK4h6YjTysGwQ3UJ0V1 k+8g== X-Gm-Message-State: AOJu0Yx5Sp7VblAHa4fr5O5MwPSnJHyGURLWUVgP0ad2Goy4t0EmhAB2 UNxnReazerfjPc4SC62n6P7/P1Qq9Pn29i7nPXd1W4fpACMfTr3L/ImndCaib/Sabs6vLj3hA6m umjMDK9d6AR7BK4x7HtWxoAzsGds0nXvr X-Google-Smtp-Source: AGHT+IGf7I0AqKgQkbfdzINdLOW876raetklKlGPVQdJ8C/zV10n01VWAZNMSsf1/E7cCnaQtG1tHPspLNx1wEcU+7c= X-Received: by 2002:a25:fc09:0:b0:e03:b3b5:8d9e with SMTP id 3f1490d57ef6-e03d1a8cc20mr225181276.5.1720121879290; Thu, 04 Jul 2024 12:37:59 -0700 (PDT) MIME-Version: 1.0 From: Lok P Date: Fri, 5 Jul 2024 01:07:47 +0530 Message-ID: Subject: Design strategy for table with many attributes To: pgsql-general Content-Type: multipart/alternative; boundary="00000000000095f181061c711547" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000095f181061c711547 Content-Type: text/plain; charset="UTF-8" Hello, In one of the applications we are getting transactions in messages/events format and also in files and then they are getting parsed and stored into the relational database. The number of attributes/columns each transaction has is ~900+. Logically they are part of one single transaction and should be stored in one table as one single row. There will be ~500million such transactions each day coming into the system. And there will be approx ~10K peak write TPS and 5K read TPS in target state. This system has a postgres database as a "source of truth" or OLTP store. And then data moves to snowflakes for the olap store. Initially when the system was designed the number of attributes per transaction was <100 but slowly the business wants to keep/persist other attributes too in the current system and the number of columns keep growing. However, as worked with some database systems , we get few suggestions from DBA's to not have many columns in a single table. For example in oracle they say not to go beyond ~255 columns as then row chaining and row migration type of things are going to hunt us. Also we are afraid concurrent DMLS on the table may cause this as a contention point. So I wanted to understand , in such a situation what would be the best design approach we should use irrespective of databases? Or say, what is the maximum number of columns per table we should restrict? Should we break the single transaction into multiple tables like one main table and other addenda tables with the same primary key to join and fetch the results wherever necessary? Regards Lok --00000000000095f181061c711547 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hello,
In one of the applications we are getting transa= ctions in messages/events format and also in files and then they are gettin= g parsed and stored into the relational database. The number of attributes/= columns each transaction has is ~900+. Logically they are part of one singl= e transaction and should be stored in one table as one single row. There wi= ll be ~500million such transactions each day coming into the system. And th= ere will be approx ~10K peak write TPS and 5K read TPS in target state. Thi= s system has a postgres database as a "source of truth" or OLTP s= tore. And then data moves to snowflakes for the olap store.

Initiall= y when the system was designed the number of attributes per transaction was= <100 but slowly the business wants to keep/persist other attributes too= in the current system and the number of columns keep growing.

Howev= er, as worked with some database systems , we get few suggestions from DBA&= #39;s to not have many columns in a single table. For example in oracle the= y say not to go beyond ~255 columns as then row chaining and row migration = type of things are going to hunt us. Also we are afraid concurrent DMLS on = the table may cause this as a contention point. So I wanted to understand ,= in such a situation what would be the best design approach we should use i= rrespective of databases? Or say, what is the maximum number of columns per= table we should restrict? Should we break the single transaction into mult= iple tables like one main table and other addenda tables with the same prim= ary key to join and fetch the results wherever necessary?

Regards
Lok
--00000000000095f181061c711547--