MIME-Version: 1.0
References: 
 <CAFj8pRBWMLP3Vyr8z+19eaiJKQoVtBfmDhNJFKXDX6uFzd4vBQ@mail.gmail.com>
 <c7eee33c5051f973ab256ee7202c15771c33f47d.camel@cybertec.at>
 <afb4e296d7ce57986f23bcfdee39b259b8f85f56.camel@cybertec.at>
 <CAFj8pRAjU-X6rEE9=1++PdtXOPc2uo=yu-tcFXByi-kN3B_7Vw@mail.gmail.com>
 <CAFj8pRC+hPCc2X88xC=pTJoqmVPApDsageZOMyqaxi5788WxHA@mail.gmail.com>
 <CAFj8pRDJ9cq00VYSHxs6LsoHNWjhYXyWWBtV6UgeWwhs0AHa9A@mail.gmail.com>
 <CAFj8pRBPXTcw_3fpKtgVthV2+9rZGhxitZ40DnAwCrK601TZZg@mail.gmail.com>
 <ndtfl4tsnpkb7m7hwvnmlpsascpgd3a7xvjmjhtxffsbrgygtm@4du6zsmnnwq5>
 <CAFj8pRAu4XvNCGu1751t=2YEqLqTjDA3FavMExm2S0KYQq=DdQ@mail.gmail.com>
 <CAFj8pRAsEoeZv0HEnA8CKgFKDSQ-wYw18Os1vdksWCV7ez2bVw@mail.gmail.com>
 <3chredgnjcmccym2kczawfih226b4ac6co7p6z4jeofevrcosi@mrsxkx2x2c65>
 <CAFj8pRDV_jp9Lsc7aN_HL48yjTFMKtCw1KVEcgxK4D0=d93xeA@mail.gmail.com>
In-Reply-To: 
 <CAFj8pRDV_jp9Lsc7aN_HL48yjTFMKtCw1KVEcgxK4D0=d93xeA@mail.gmail.com>
From: Pavel Stehule <pavel.stehule@gmail.com>
Date: Fri, 15 Nov 2024 05:45:58 +0100
Message-ID: 
 <CAFj8pRB9X5XOU1gRfMAvF-uQMNFr6VqwTzzxxfqRjUeJ_0BN7g@mail.gmail.com>
Subject: Re: proposal: schema variables
To: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>, Erik Rijkers <er@xs4all.nl>,
	Michael Paquier <michael@paquier.xyz>, Amit Kapila <amit.kapila16@gmail.com>,
	DUVAL REMI <REMI.DUVAL@cheops.fr>,
	PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="00000000000080638f0626ec40ef"
Archived-At: 
 <https://www.postgresql.org/message-id/CAFj8pRB9X5XOU1gRfMAvF-uQMNFr6VqwTzzxxfqRjUeJ_0BN7g%40mail.gmail.com>
Precedence: bulk

--00000000000080638f0626ec40ef
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 14, 2024 at 8:41, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

>
>
> On Wed, Nov 13, 2024 at 17:35, Dmitry Dolgov <9erthalion6@gmail.com>
> wrote:
>
>> > On Sun, Nov 10, 2024 at 06:51:40PM GMT, Pavel Stehule wrote:
>> > On Sun, Nov 10, 2024 at 17:19, Pavel Stehule
>> > <pavel.stehule@gmail.com> wrote:
>> > I have spent a lot of time thinking about better solutions for
>> > identifier collisions, and I really don't think there is a
>> > consistent, user-friendly syntax. Personally, I think there is an
>> > easy, already implemented solution: a convention. Just use a
>> > dedicated schema for variables, and keep this schema out of the
>> > search path. Or use a secondary convention, like the prefix "__" for
>> > session variables. A common convention is the prefix "_" for
>> > PL/pgSQL variables. I looked at how this issue is solved in other
>> > databases and in the standard, and I found nothing special. Oracle
>> > and SQL/PSM have a concept of visibility: variables are not visible
>> > outside packages or modules, but Postgres has nothing similar. It
>> > can be emulated by a dedicated schema that is not in the search
>> > path, but that is a weaker guarantee.
>> >
>> > I think we can introduce an alternative syntax that will not be
>> > user-friendly or easy to read, but that avoids collisions, or at
>> > least decreases the possible risks.
>> >
>> > This is nothing new. SQL does it with the old and "new" syntax of
>> > inner joins, and in Postgres we can write
>> >
>> > where salary < 40000
>> >
>> > or
>> >
>> > where pg_catalog.int4lt(salary, 40000);
>> >
>> > or something like what we use for operators,
>> > OPERATOR(*schema*.*operatorname*).
>> >
>> > So I really like introducing a VARIABLE(schema.variablename) syntax
>> > as an alternative way of accessing variables. I strongly prefer to
>> > have it only as an alternative (secondary) syntax, because I don't
>> > think it is friendly to read or write, but it is safe. I can imagine
>> > tools that rewrite the generic syntax to this special one, or that
>> > detect the generic syntax and show a warning. Then users can choose
>> > what they prefer. Two syntaxes, generic and special, can be good
>> > enough for everyone, and this can be perfectly consistent with
>> > current Postgres.
>>
>> As far as I recall, the last time this topic was discussed on hackers,
>> two options were proposed: the one with VARIABLE(name), which you
>> mention here; and another one that adds variables to the FROM clause.
>> The VARIABLE(...) syntax didn't get much negative feedback, so I guess
>> why not -- if you find it fitting, it would be interesting to see the
>> implementation.
>>
>> I'm afraid it should not be just an alternative syntax, but the only
>> one allowed, because otherwise I don't see how scenarios like "drop a
>> column with the same name" could be avoided. As in the previous thread:
>>
>>     -- we've got a variable b at the same time
>>     SELECT a, b FROM table1;
>>
>> Then someone drops the column b, but everything still works because
>> the variable b gets silently picked up. But if it were required to say
>> VARIABLE(b), then all would be fine.
>>
>
> In this scenario you will get a warning about variable shadowing
> (before you drop the column).
>
> I think this issue is partially similar to creating two equally named
> tables in different schemas, with both schemas in the search path. When
> you drop one table, the query still works, but the result is different.
> It is the same issue. SQL has no concept of shadowing, and at the base
> level it is not necessary. But when you integrate SQL with procedural
> code, you have to solve this issue (or accept it). The issue is real,
> and it exists in every procedural extension of SQL that I know of that
> uses the same syntax. On the other hand, I doubt it is a problem in
> practice. Changes to the system catalog are tested before production,
> so you will probably see a warning about a shadowed variable, and you
> will probably get different results, because variable b has the same
> value for all rows and probably a different value than column b. I can
> imagine the need to disable this warning on production systems.
> Shadowing by itself is probably not an issue, but it is a signal of
> code quality problems.
>
> But this scenario is real, and so the question is whether the warning
> about shadowed variables should be optional at all, and whether it
> should be possible to disable it. Maybe not. Generally, shadowing is a
> strange concept: it is a safeguard against serious issues, but it
> should not be relied on in general, and wherever it occurs the
> developer should rename the conflicting identifiers.
>

There is another argument against requiring the FROM clause for
variables: it solves just one special case and leaves others uncovered.

Theoretically, variables can have the same names as tables. The table
shadows the variable, so the query works. But when somebody drops the
table, the query can still work, because the variable is now picked up.
So requiring variables to be listed in the FROM clause protects us only
against dropping a column, not against dropping a table. In Postgres,
dropping a table is already potentially risky because of search_path
(which introduces the concept of shadowing), even without introducing
variables. This issue is possible, but how common is it?
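
The search_path risk mentioned above can be sketched in plain SQL,
without any session variables. This is only an illustration; the schema
names s1 and s2 and the table name t are made up for the example:

```sql
-- Two equally named tables in different schemas (hypothetical names).
CREATE SCHEMA s1;
CREATE SCHEMA s2;
CREATE TABLE s1.t (a int);
CREATE TABLE s2.t (a int);
INSERT INTO s1.t VALUES (1);
INSERT INTO s2.t VALUES (2);

-- Both schemas are in the search path; s1 comes first and shadows s2.
SET search_path TO s1, s2;
SELECT a FROM t;   -- resolves to s1.t

-- After the drop, the unqualified name silently resolves to s2.t.
DROP TABLE s1.t;
SELECT a FROM t;   -- still runs, but now reads s2.t
```

The second SELECT raises no error; only the result changes, which is
the same kind of silent switch as in the dropped-column scenario.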

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>> And to make sure we're on the same page, could you post a couple of
>> examples from the tests currently in the patch, showing how they would
>> look with this proposal?
>>
>> About adding variables to the FROM clause. Looks like this option was
>> quite popular, and you've mentioned some technical challenges
>> implementing that. If you'd like to go with another approach, it would
>> be great to elaborate on that -- maybe even with a PoC, to make a
>> convincing point here.
>>
>

--00000000000080638f0626ec40ef--