MIME-Version: 1.0
References: <CAKAnmm+tbPMdP8ccrJ-o_LVgC6ADdOEoh2=J+zyWNLab6B3+_Q@mail.gmail.com>
In-Reply-To: <CAKAnmm+tbPMdP8ccrJ-o_LVgC6ADdOEoh2=J+zyWNLab6B3+_Q@mail.gmail.com>
From: Pavel Stehule <pavel.stehule@gmail.com>
Date: Wed, 17 Jul 2024 19:41:17 +0200
Message-ID: <CAFj8pRA8aJdafbL2oK6z97jGWSKpSrfQC3RDC_oKbJa-v46rFg@mail.gmail.com>
Subject: Re: Planet Postgres and the curse of AI
To: Greg Sabino Mullane <htamfids@gmail.com>
Cc: pgsql-general <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="000000000000603cd9061d74fa2d"
Archived-At: <https://www.postgresql.org/message-id/CAFj8pRA8aJdafbL2oK6z97jGWSKpSrfQC3RDC_oKbJa-v46rFg%40mail.gmail.com>
Precedence: bulk

--000000000000603cd9061d74fa2d
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 17 Jul 2024 at 19:22, Greg Sabino Mullane <htamfids@gmail.com>
wrote:

> I've been noticing a growing trend of blog posts written mostly, if not
> entirely, with AI (aka LLMs, ChatGPT, etc.). I'm not sure where to raise
> this issue. I considered a blog post, but this mailing list seemed a
> better forum to generate a discussion.
>
> The problem is two-fold as I see it.
>
> First, there is the issue of people trying to game the system by churning
> out content that is not theirs, but was written by a LLM. I'm not going
> to name specific posts, but after a while it gets easy to recognize
> things that are written mostly by AI.
>
> These blog posts are usually generic, describing some part of Postgres
> in an impersonal, mid-level way. Most of the time the facts are not
> wrong, per se, but they lack nuances that a real DBA would bring to the
> discussion, and often leave important things out. Code examples are often
> wrong in subtle ways. Places where you might expect a deeper discussion
> are glossed over.
>
> So this first problem is that it is polluting the Postgres blogs with
> overly bland, moderately helpful posts that are not written by a human,
> and do not really bring anything interesting to the table. There is a
> place for posts that describe basic Postgres features, but the ones
> written by humans are much better. (yeah, yeah, "for now" and all hail
> our AI overlords in the future).
>
> The second problem is worse, in that LLMs are not merely gathering
> information, but have the ability to synthesize new conclusions and
> facts. In short, they can lie. Or hallucinate. However you want to call
> it, it's a side effect of the way LLMs work. In a technical field like
> Postgres, this can be a very bad thing. I don't know how widespread this
> is, but I was tipped off about this over a year ago when I came across a
> blog suggesting
> using the "max_toast_size configuration parameter". For those not
> familiar, I can assure you that Postgres does not have, nor will likely
> ever have, a GUC with that name.
>
> As anyone who has spoken with ChatGPT knows, getting small important
> details correct is not its forte. I love ChatGPT and actually use it
> daily. It is amazing at doing certain tasks. But writing blog posts
> should not be one of them.
>
> Do we need a policy or a guideline for Planet Postgres? I don't know. It
> can be a gray line. Obviously spelling and grammar checking is quite
> okay, and making up random GUCs is not, but the middle bit is very hazy.
> (Human) thoughts welcome.
>

It is very unpleasant to read a long article and realize at the end that
it contains no valuable information. Planet MariaDB
(https://mariadb.org/planet/) was in a terrible state, but it has since
been cleaned up. I am in favor of some form of moderation, and of gently
nudging authors who write articles that add nothing beyond the
documentation.

Regards

Pavel


>
> Cheers,
> Greg
>
>

--000000000000603cd9061d74fa2d--