From: Pavel Stehule
Date: Wed, 17 Jul 2024 19:41:17 +0200
Subject: Re: Planet Postgres and the curse of AI
To: Greg Sabino Mullane
Cc: pgsql-general

On Wed, 17 Jul 2024 at 19:22, Greg Sabino Mullane wrote:

> I've been noticing a growing trend of blog posts written mostly, if not
> entirely, with AI (aka LLMs, ChatGPT, etc.). I'm not sure where to raise
> this issue. I considered a blog post, but this mailing list seemed a better
> forum to generate a discussion.
>
> The problem is two-fold as I see it.
>
> First, there is the issue of people trying to game the system by churning
> out content that is not theirs, but was written by an LLM. I'm not going to
> name specific posts, but after a while it gets easy to recognize things
> that are written mostly by AI.
>
> These blog posts are usually generic, describing some part of Postgres
> in an impersonal, mid-level way. Most of the time the facts are not
> wrong, per se, but they lack nuances that a real DBA would bring to the
> discussion, and often leave important things out. Code examples are often
> wrong in subtle ways. Places where you might expect a deeper discussion are
> glossed over.
>
> So this first problem is that it is polluting the Postgres blogs with
> overly bland, moderately helpful posts that are not written by a human, and
> do not really bring anything interesting to the table. There is a place for
> posts that describe basic Postgres features, but the ones written by humans
> are much better. (yeah, yeah, "for now" and all hail our AI overlords in
> the future).
>
> The second problem is worse, in that LLMs are not merely gathering
> information, but have the ability to synthesize new conclusions and facts.
> In short, they can lie. Or hallucinate. Whatever you want to call it, it's a
> side effect of the way LLMs work. In a technical field like Postgres, this
> can be a very bad thing. I don't know how widespread this is, but I was
> tipped off about this over a year ago when I came across a blog suggesting
> using the "max_toast_size configuration parameter". For those not
> familiar, I can assure you that Postgres does not have, nor will likely
> ever have, a GUC with that name.
>
> As anyone who has spoken with ChatGPT knows, getting small important
> details correct is not its forte. I love ChatGPT and actually use it daily.
> It is amazing at doing certain tasks. But writing blog posts should not be
> one of them.
>
> Do we need a policy or a guideline for Planet Postgres? I don't know. It
> can be a gray line. Obviously spelling and grammar checking is quite
> okay, and making up random GUCs is not, but the middle bit is very hazy.
> (Human) thoughts welcome.
>

It is very unpleasant to read a long article and realize at the end that it
contains no valuable information. The situation on Planet MariaDB
(https://mariadb.org/planet/) was terrible, but it has since been cleaned up.
I am in favor of some form of moderation, and of gently nudging authors whose
articles add nothing beyond the documentation.

Regards

Pavel

>
> Cheers,
> Greg
>

