public inbox for [email protected]
help / color / mirror / Atom feedFrom: Jürgen Purtz <[email protected]>
To: David G. Johnston <[email protected]>
Cc: Peter Eisentraut <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Thomas Kellerer <[email protected]>
Cc: Pg Docs <[email protected]>
Subject: Re: Change JOIN tutorial to focus more on explicit joins
Date: Thu, 22 Oct 2020 15:32:00 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAKFQuwZY5EqtMH_9SzTG=wSxq14m3FuwqPNDeeyK=Z6JO5-vUw@mail.gmail.com>
References: <[email protected]>
<CA+hUKG+K+75fo+cKA2Vb_kmFsxX=9qgi48_dLqPyShfhFdSKMw@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
<CAKFQuwZY5EqtMH_9SzTG=wSxq14m3FuwqPNDeeyK=Z6JO5-vUw@mail.gmail.com>
On 22.10.20 01:40, David G. Johnston wrote:
> On Fri, Sep 4, 2020 at 2:36 AM Jürgen Purtz <[email protected]
> <mailto:[email protected]>> wrote:
>
> On 04.09.20 08:52, Peter Eisentraut wrote:
> >
> > For the remaining patch I have a couple of concerns:
>
>
> This patch should not be changing the formatting choices for these
> queries, just the addition of a JOIN clause and modification of the
> WHERE clause. Specifically, SELECT is left-aligned while all
> subsequent clauses indent under it. Forced alignment by adding extra
> spaces isn't done here either. I have not altered those in the attached.
>
> Did some word-smithing on the first paragraph. The part about the
> cross-join was hurt by "in some way" and "may be" is not needed.
>
> Pointing out that values from both tables doesn't seem like an
> improvement when the second item covers that and it is more specific
> in noting that the city name that is joined on appears twice - once
> from each table.
>
> ON expression is more precise and the reader should be ok with the term.
>
> Removal of the exercise is good. Not the time to discuss cross join
> anyway. Given that "ON true" works the cross join form isn't even
> required.
>
> In the FROM clause form I would not add table prefixes to the column
> names. They are not part of the form changing. If discussion about
> table prefixing is desired it should be done explicitly and by
> itself. They are used later on, I didn't check to see whether that
> was covered or might be confusing.
>
> I suggested a wording for why to use join syntax that doesn't involve
> legacy and points out its merit compared to sticking a join expression
> into the where clause.
>
> The original patch missed having the syntax for the first left outer
> join conform to the multi-line query writing standard you introduced.
> I did not change.
>
> The "AND" ON clause should just go with (not changed):
>
> ON (w1.temp_lo < w2.temp_lo
> AND w1.temp_hi > w2.temp_high);
>
> Attaching my suggestions made on top of the attached original
> 0002-query.patch
>
> David J.
>
(Hopefully) I have integrated all of David's suggestions as well as the
following rules:
- Syntax formatting with the previously used 4 spaces plus newline for JOIN
- Table aliases only when necessary or explicitly discussed
The discussion about the explicit vs. implicit syntax is added to the
"As join expressions serve a specific purpose ... " sentence and creates
a paragraph of its own.
The patch is build on top of master.
--
J. Purtz
Attachments:
[text/x-patch] 0003-query.patch (6.6K, 3-0003-query.patch)
download | inline diff:
diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index e73e805ec4..532427ab4e 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -440,13 +440,12 @@ SELECT DISTINCT city
<para>
Thus far, our queries have only accessed one table at a time.
- Queries can access multiple tables at once, or access the same
- table in such a way that multiple rows of the table are being
- processed at the same time. A query that accesses multiple rows
- of the same or different tables at one time is called a
- <firstterm>join</firstterm> query. As an example, say you wish to
- list all the weather records together with the location of the
- associated city. To do that, we need to compare the <structfield>city</structfield>
+ Queries which access multiple tables (including repeats) at once are called
+ <firstterm>join</firstterm> queries. They internally combine
+ each row from one table with each row of a second table. An expression is
+ specified to limit which pairs of rows are returned.
+ For example, to return all the weather records together with the location of the
+ associated city, the database compare the <structfield>city</structfield>
column of each row of the <structname>weather</structname> table with the
<structfield>name</structfield> column of all rows in the <structname>cities</structname>
table, and select the pairs of rows where these values match.
@@ -461,10 +460,17 @@ SELECT DISTINCT city
<programlisting>
SELECT *
- FROM weather, cities
- WHERE city = name;
+ FROM weather
+ JOIN cities ON (city = name);
</programlisting>
+ After the keyword <command>ON</command> follows the
+ expression comparing their rows. In this case, the
+ definition how to compare their rows. In this case, the
+ column <varname>city</varname> of table <varname>weather</varname>
+ must be equal to the column <varname>name</varname>
+ of table <varname>cities</varname>.
+
<screen>
city | temp_lo | temp_hi | prcp | date | name | location
---------------+---------+---------+------+------------+---------------+-----------
@@ -499,23 +505,14 @@ SELECT *
<literal>*</literal>:
<programlisting>
SELECT city, temp_lo, temp_hi, prcp, date, location
- FROM weather, cities
- WHERE city = name;
+ FROM weather
+ JOIN cities ON (city = name);
</programlisting>
</para>
</listitem>
</itemizedlist>
</para>
- <formalpara>
- <title>Exercise:</title>
-
- <para>
- Attempt to determine the semantics of this query when the
- <literal>WHERE</literal> clause is omitted.
- </para>
- </formalpara>
-
<para>
Since the columns all had different names, the parser
automatically found which table they belong to. If there
@@ -526,8 +523,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
<programlisting>
SELECT weather.city, weather.temp_lo, weather.temp_hi,
weather.prcp, weather.date, cities.location
- FROM weather, cities
- WHERE cities.name = weather.city;
+ FROM weather
+ JOIN cities ON (cities.name = weather.city);
</programlisting>
It is widely considered good style to qualify all column names
@@ -537,15 +534,29 @@ SELECT weather.city, weather.temp_lo, weather.temp_hi,
<para>
Join queries of the kind seen thus far can also be written in this
- alternative form:
+ form:
<programlisting>
SELECT *
- FROM weather INNER JOIN cities ON (weather.city = cities.name);
+ FROM weather, cities
+ WHERE city = name;
</programlisting>
- This syntax is not as commonly used as the one above, but we show
- it here to help you understand the following topics.
+ This syntax pre-dates the <literal>JOIN</literal> and <literal>ON</literal>
+ keywords. The tables are simply listed in the <literal>FROM</literal>,
+ comma-separated, and the comparison expression added to the
+ <literal>WHERE</literal> clause.
+ </para>
+
+ <para>
+ As join expressions serve a specific
+ purpose in a multi-table query it is preferable to make them stand-out
+ by using join clauses to introduce additional tables into the query.
+ The results from the old, implicit syntax and the new, explicit
+ JOIN/ON syntax are identical. But for a reader of the statement
+ its meaning is now easier to understand: the join condition is
+ introduced by its own key word whereas previously the condition was
+ merged into the WHERE clause together with other conditions.
</para>
<indexterm><primary>join</primary><secondary>outer</secondary></indexterm>
@@ -558,12 +569,13 @@ SELECT *
found we want some <quote>empty values</quote> to be substituted
for the <structname>cities</structname> table's columns. This kind
of query is called an <firstterm>outer join</firstterm>. (The
- joins we have seen so far are inner joins.) The command looks
- like this:
+ joins we have seen so far are <firstterm>inner joins</firstterm>.)
+ The command looks like this:
<programlisting>
SELECT *
- FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
+ FROM weather
+ LEFT OUTER JOIN cities ON (city = name);
</programlisting>
<screen>
@@ -593,10 +605,9 @@ SELECT *
</para>
</formalpara>
+ <indexterm><primary>join</primary><secondary>self</secondary></indexterm>
+ <indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
<para>
- <indexterm><primary>join</primary><secondary>self</secondary></indexterm>
- <indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
-
We can also join a table against itself. This is called a
<firstterm>self join</firstterm>. As an example, suppose we wish
to find all the weather records that are in the temperature range
@@ -610,10 +621,9 @@ SELECT *
<programlisting>
SELECT w1.city, w1.temp_lo AS low, w1.temp_hi AS high,
- w2.city, w2.temp_lo AS low, w2.temp_hi AS high
- FROM weather w1, weather w2
- WHERE w1.temp_lo < w2.temp_lo
- AND w1.temp_hi > w2.temp_hi;
+ w2.city, w2.temp_lo AS low, w2.temp_hi AS high
+ FROM weather w1
+ JOIN weather w2 ON (w1.temp_lo < w2.temp_lo AND w1.temp_hi > w2.temp_hi);
</programlisting>
<screen>
@@ -630,8 +640,8 @@ SELECT w1.city, w1.temp_lo AS low, w1.temp_hi AS high,
queries to save some typing, e.g.:
<programlisting>
SELECT *
- FROM weather w, cities c
- WHERE w.city = c.name;
+ FROM weather w
+ JOIN cities c ON (w.city = c.name);
</programlisting>
You will encounter this style of abbreviating quite frequently.
</para>
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Change JOIN tutorial to focus more on explicit joins
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox