From: Jürgen Purtz
Subject: Re: Add A Glossary
To: Laurenz Albe, pgsql-hackers@postgresql.org, Pg Docs
Cc: Erik Rijkers, Fabien COELHO, Corey Huinker, Justin Pryzby, Roger Harkavy
Date: Wed, 20 May 2020 13:17:29 +0200
Message-ID: <838c8e85-f3fe-dc68-9367-b413e167578d@purtz.de>
In-Reply-To: <39586b2ad7ee14a50fd3aacdc412b6d826da0039.camel@cybertec.at>
References: <20200517152851.GA31376@alvherre.pgsql> <39586b2ad7ee14a50fd3aacdc412b6d826da0039.camel@cybertec.at>

On 19.05.20 08:17, Laurenz Albe wrote:
> On Mon, 2020-05-18 at 18:08 +0200, Jürgen Purtz wrote:
>> cluster/instance: PG (mainly) consists of a group of processes that commonly
>> act on shared buffers. The processes are very closely related to each other
>> and with the buffers. They exist altogether or not at all. They use a common
>> initialization file and are incarnated by one command. Everything exists
>> solely in RAM and therefor has a fluctuating nature. In summary: they build
>> a unit and this unit needs to have a name of itself. In some pages we used
>> to use the term *instance* - sometimes in extended forms: *database instance*,
>> *PG instance*, *standby instance*, *standby server instance*, *server instance*,
>> or *remote instance*.  For me, the term *instance* makes sense, the extensions
>> *standby instance* and *remote instance* in their context too.
> FWIW, I feel somewhat like Alvaro on that point; I use those terms synonymously,
> perhaps distinguishing between a "started cluster" and a "stopped cluster".
> After all, "cluster" refers to "a cluster of databases", which are there, regardless
> if you start the server or not.
>
> The term "cluster" is unfortunate, because to most people it suggests a group of
> machines, so the term "instance" is better, but that ship has sailed long ago.
>
> The static part of a cluster to me is the "data directory".

cluster/instance: The different nature (static vs. dynamic) of what I call "cluster" and "instance", as well as the existence of the two commands initdb ("create a new PostgreSQL database cluster") and pg_ctl ("initialize, start, stop, or control a PostgreSQL server"), confirms my opinion that we need two different terms for them. The two terms should not be synonyms; they label distinct things. If people prefer "data directory" to "cluster", that is fine with me.
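
To make the static/dynamic split concrete, here is a minimal sketch in Python. It only builds the command lines for the two programs and does not run them. The function names and the example path are my own assumptions for illustration; the only things taken from PG are the program names initdb and pg_ctl, their -D option, and the pg_ctl modes start, stop, and status.

```python
from typing import List

# pg_ctl modes used in this sketch; pg_ctl supports more than these.
PG_CTL_ACTIONS = {"start", "stop", "status"}


def initdb_command(data_dir: str) -> List[str]:
    """Command line for the static part: create the data directory ("cluster")."""
    return ["initdb", "-D", data_dir]


def pg_ctl_command(data_dir: str, action: str) -> List[str]:
    """Command line for the dynamic part: control the processes ("instance")."""
    if action not in PG_CTL_ACTIONS:
        raise ValueError(f"unsupported pg_ctl action in this sketch: {action}")
    return ["pg_ctl", "-D", data_dir, action]


if __name__ == "__main__":
    data_dir = "/tmp/pgdata"  # hypothetical path, for illustration only
    # The cluster is created once and stays on disk ...
    print(" ".join(initdb_command(data_dir)))
    # ... while the instance comes and goes on top of it.
    print(" ".join(pg_ctl_command(data_dir, "start")))
    print(" ".join(pg_ctl_command(data_dir, "stop")))
```

On a machine with PG installed, the returned lists could be passed to subprocess.run. After the stop command the data directory is still there, but the instance is gone, which is exactly the difference the two terms should express.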

There are situations where we need a single term for both of them. "Instance and its data directory" or "instance and its cluster" are too wordy. In many cases we use "database server" or "server" in this sense. IMO "server" is too short and ambiguous. "Database server", the plural form "databases server", or the new and more accurate term "cluster server" would be fine with me. (Like "server", the term "cluster" is used in many different contexts, but only outside the PG world; within our context "cluster" is not ambiguous.)

>> server/host: We need a term to describe the underlying hardware respectively
>> the virtual machine or container, where PG is running. I suggest to use both
>> *server* and *host*. In computer science, both have their eligibility and are
>> widely used. Everybody understands *client/server architecture* or *host* in
>> TCP/IP configuration. We cannot change such matter of course. I suggest to
>> use both depending on the context, but with the same meaning: "real hardware,
>> a container, or a virtual machine".
> On this I have a strong opinion because of my Unix mindset.
> "machine" and "host" are synonyms, and it doesn't matter to the database if they
> are virtualized or not.  You can always disambiguate by adding "virtual" or "physical".
>
> A "server" is a piece of software that responds to client requests, never a machine.
> In my book, this is purely Windows jargon.  The term "client-server architecture"
> that you quote emphasized that.
>
> Perhaps "machine" would be the preferable term, because "host" is more prone to
> misunderstandings (except in a networking context).

server/host: I agree that we are not interested in whether PG runs on real hardware or in some virtualization container. We are not even interested in the operating system. Our primary concern is the existence of a port of the Internet Protocol. But is the term "server" appropriate for naming an IP port? In addition, "server" is used with other meanings:

a) the previously mentioned "database server"
b) a (virtual) machine: "server-side", "... the file ... loaded by the server ..."
c) the binaries: "... the server must be built with SSL support ..."
d) whatever seems appropriate in the context: "standby server", "... the server parses query ...", "server configuration", "server process"

Because of its ambiguous usage, the definition of "server" must clarify the allowed meanings. How about:

--

server: Depending on the context, the term *server* denotes:

  • An IP port that is offered by any OS.   ?????
  • A (possibly virtualized) machine.
  • An abbreviation for the slightly longer term "database(s)/cluster server".  ??? This helps readability, but not clarity ???
  • More?

--

The term "host" is used mainly in IP configuration ("host name", "host address") and in the context of compiling ("host language", "host variable"). These are clear situations and can be defined easily.

