Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1snzfQ-007UBm-W0 for pgsql-general@arkaria.postgresql.org; Tue, 10 Sep 2024 12:08:22 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1snzfQ-00CHru-8C for pgsql-general@arkaria.postgresql.org; Tue, 10 Sep 2024 12:08:20 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1snzfP-00CHrm-Oq for pgsql-general@lists.postgresql.org; Tue, 10 Sep 2024 12:08:19 +0000 Received: from mail-vs1-xe34.google.com ([2607:f8b0:4864:20::e34]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1snzfM-000Sh9-Jq for pgsql-general@lists.postgresql.org; Tue, 10 Sep 2024 12:08:18 +0000 Received: by mail-vs1-xe34.google.com with SMTP id ada2fe7eead31-49bd32f6a11so220815137.1 for ; Tue, 10 Sep 2024 05:08:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725970096; x=1726574896; darn=lists.postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=PcMVRpk70AY8jGm84+5eMGXXlWNvOyaW9y4kOR/pXXU=; b=XgMzSNQYJ4XT6hE0U73ApFh2WwEa01YHxwZOG+RpSV53k7GWJ2Iup1i8HNeDNpedqc VTJDDuY9HL0bxJgD1DzmZehdZ3ujxm2BV4LduL3yPV/8GAl6h16s0FqeVk8GY3d4GmKr on50ez+j5/DM35IosEUibo2JjhiavMii96/e06TWshVi7Zd69zf10MY0XddSmJzNA0b1 b5q6TUysSsv9QBw7k10EG5fqpLZxJ+8S8fly61G2QYJrFEhXYSwQ1oHRRYOR3ewMss38 eX8m6BwY9ROL789HtQZCpumW00D8w7KW9IXBUA2BTawSIMDa0nfKsYnQkevyKdC91gNu J0NA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725970096; x=1726574896; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PcMVRpk70AY8jGm84+5eMGXXlWNvOyaW9y4kOR/pXXU=; b=TTi9AVFRS7DagWW61WK5Wh58jABeaS7GCRZAw2m97X5oaZFmNZG/TIWBBv3ox9WT+y j1xaJJuqukJqW5Rnl03ACLXk79mSVaRDpNEO1Zj1Fhh5OcrgVkO1BUY4qlY4yin4ZWnq eaKV/xaReC0rJzaMsoQUMFaDpnFPbFLkcrlZbZgHo7x/wGpwZqP1jx0n0/m/2HIA6GYJ n5xmPE+61WBEPCrxKUvV2uALBLgQD50bxT+f1qn/WjIzRYmzSDLGBctE5sgQsZFBI6nY RqIZozQVQOIUpqOBp0vQz8LbnYjeX4hpLJ4bfb+vT76BeEGAEkj/bBpu0KDNOhGXyR2D uQhg== X-Gm-Message-State: AOJu0YwNrT1D+mCfUrMDYQJVbT+Tz0Ri94mLWDWOGl7oY2FynFS0k/Sz RJVgrviCiOd5escLgJlVRu/UVKxQnTMGkaRdN4Xgdrn8kiSEtYrtheA6qcAd7g2owqBvidtgAE7 cYqxqmVp5MDlqOPfMZOtPVFnD/X2/cA== X-Google-Smtp-Source: AGHT+IFp9Uax7VKYiqn5LcCL4odCHyBvOaRCksDRTgUjWYHFlUMviTCjCVSRM6hbJsSR2teLyAcTQjCzKHTxpUiF59E= X-Received: by 2002:a05:6122:a09:b0:4f5:2c0c:8528 with SMTP id 71dfb90a1353d-502bede4defmr8364067e0c.11.1725970095611; Tue, 10 Sep 2024 05:08:15 -0700 (PDT) MIME-Version: 1.0 References: <20240910091121.2pgurwdqwse4xm4j@hjp.at> In-Reply-To: <20240910091121.2pgurwdqwse4xm4j@hjp.at> From: Shaheed Haque Date: Tue, 10 Sep 2024 13:08:03 +0100 Message-ID: Subject: Re: Database schema for "custom fields" To: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000713dc80621c2ba74" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000713dc80621c2ba74 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable The relational purists will gave their concerns, but especially given what you described about your performance and volumetrics, there is a reason why JSON(b) is a thing. For type checking, and more, I've had success a multi-key approach so that one entry might comprise: - A "name" - A "type" - A "value" Of course you can add more as needed. On Tue, 10 Sep 2024, 10:11 Peter J. Holzer, wrote: > On 2024-09-10 12:08:14 +0500, Muhammad Usman Khan wrote: > > There is not a properly defined solution but you can try the > > Entity-Attribute-Value (EAV) Model. This is an alternative approach, > where a > > separate table is used to store custom fields as attributes for each > record. > > New fields can be added without altering the schema. There will be no > need for > > DDL changes. There might be some cons as you might need multiple joins = to > > retrieve all fields for a given record. > > I think this is essentially Matthias' option 3: > > > On Tue, 10 Sept 2024 at 11:57, Matthias Leisi > wrote: > > > > I=E2=80=99m looking for input on a database design question. > > > > Suppose you have an application that allows the user to add some > kind of > > field to the application (=E2=80=9Ecustom fields=E2=80=9C, =E2=80= =9Euser defined fields=E2=80=9C, > =E2=80=9Eextended > > fields=E2=80=9C, =E2=80=A6), which could be of different types (eg = string, int, > bool, date, > > array of , =E2=80=A6), and which would have some ad= ditional > > properties (like a display name or description, or some access > control > > flags). > [...] > > How would you design this from a DB point of view? I see a few > options, but > > all have some drawbacks: > [...] > > 3) Use a =E2=80=9Edata table=E2=80=9C with one column per potential= type (fieldid, > > valstring, valint, valbool, =E2=80=A6). Drawback: complex to query,= waste of > > storage? Pro: use all DB features on =E2=80=9Etrue=E2=80=9C columns= , but without > needing > > DDL privileges. > > > > Are these the right drawbacks and pro arguments? Do you see other > options? > > I pretty much agree with your analysis. I used to use your option 3 a > lot, mostly because I thought that the schema should be fixed at design > time and not changed by the application. I'm less dogmatic now and would > probably lean more to your option 1 (let the application add columns to > an "extension table"). > > hp > > -- > _ | Peter J. Holzer | Story must make more sense than reality. > |_|_) | | > | | | hjp@hjp.at | -- Charles Stross, "Creative writing > __/ | http://www.hjp.at/ | challenge!" > --000000000000713dc80621c2ba74 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

The relational purists will gave their concerns, but especia= lly given what you described about your performance and volumetrics, there = is a reason why JSON(b) is a thing. For type checking, and more, I've h= ad success a multi-key approach so that one entry might comprise:

- A "name"
- A "type"
- A "value"

Of course you can add more as needed.


On Tue= , 10 Sep 2024, 10:11 Peter J. Holzer, <hjp-pgsql@hjp.at> wrote:
= On 2024-09-10 12:08:14 +0500, Muhammad Usman Khan wrote:
> There is not a properly defined=C2=A0solution but you can try the
> Entity-Attribute-Value (EAV) Model. This is an alternative approach, w= here a
> separate table is used to store custom fields as attributes for each r= ecord.
> New fields can be added without altering the schema. There will be no = need for
> DDL changes. There might be some cons as you might need multiple joins= to
> retrieve all fields for a given record.

I think this is essentially Matthias' option 3:

> On Tue, 10 Sept 2024 at 11:57, Matthias Leisi <matthias@leisi.net> wrote:
>
>=C2=A0 =C2=A0 =C2=A0I=E2=80=99m looking for input on a database design = question.=C2=A0
>
>=C2=A0 =C2=A0 =C2=A0Suppose you have an application that allows the use= r to add some kind of
>=C2=A0 =C2=A0 =C2=A0field to the application (=E2=80=9Ecustom fields=E2= =80=9C, =E2=80=9Euser defined fields=E2=80=9C, =E2=80=9Eextended
>=C2=A0 =C2=A0 =C2=A0fields=E2=80=9C, =E2=80=A6), which could be of diff= erent types (eg string, int, bool, date,
>=C2=A0 =C2=A0 =C2=A0array of <any other type>, =E2=80=A6), and wh= ich would have some additional
>=C2=A0 =C2=A0 =C2=A0properties (like a display name or description, or = some access control
>=C2=A0 =C2=A0 =C2=A0flags).
[...]
>=C2=A0 =C2=A0 =C2=A0How would you design this from a DB point of view? = I see a few options, but
>=C2=A0 =C2=A0 =C2=A0all have some drawbacks:
[...]
>=C2=A0 =C2=A0 =C2=A03) Use a =E2=80=9Edata table=E2=80=9C with one colu= mn per potential type (fieldid,
>=C2=A0 =C2=A0 =C2=A0valstring, valint, valbool, =E2=80=A6). Drawback: c= omplex to query, waste of
>=C2=A0 =C2=A0 =C2=A0storage? Pro: use all DB features on =E2=80=9Etrue= =E2=80=9C columns, but without needing
>=C2=A0 =C2=A0 =C2=A0DDL privileges.


>=C2=A0 =C2=A0 =C2=A0Are these the right drawbacks and pro arguments? Do= you see other options?

I pretty much agree with your analysis. I used to use your option 3 a
lot, mostly because I thought that the schema should be fixed at design
time and not changed by the application. I'm less dogmatic now and woul= d
probably lean more to your option 1 (let the application add columns to
an "extension table").

=C2=A0 =C2=A0 =C2=A0 =C2=A0 hp

--
=C2=A0 =C2=A0_=C2=A0 | Peter J. Holzer=C2=A0 =C2=A0 | Story must make more = sense than reality.
|_|_) |=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 |
| |=C2=A0 =C2=A0|
hjp@hjp.at=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0|=C2=A0 =C2=A0 --= Charles Stross, "Creative writing
__/=C2=A0 =C2=A0| http://www.hjp.at/ |=C2=A0 =C2=A0 =C2=A0 =C2=A0ch= allenge!"
--000000000000713dc80621c2ba74--