Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jZNmD-0007BA-S5 for pgsql-hackers@arkaria.postgresql.org; Fri, 15 May 2020 00:00:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1jZNmB-0003zb-9a for pgsql-hackers@arkaria.postgresql.org; Fri, 15 May 2020 00:00:31 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jZNmA-0003v0-Kk for pgsql-hackers@lists.postgresql.org; Fri, 15 May 2020 00:00:31 +0000 Received: from mail-qt1-x843.google.com ([2607:f8b0:4864:20::843]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1jZNm2-0005nl-Ps for pgsql-hackers@postgresql.org; Fri, 15 May 2020 00:00:29 +0000 Received: by mail-qt1-x843.google.com with SMTP id t25so523581qtc.0 for ; Thu, 14 May 2020 17:00:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=2ndquadrant-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=STFKS12E3vLYK3UNmgJCBOkO7FoCR90LCYdRWS1nrkU=; b=l2C+im1DNy0AaQ1uJQXfE+XduH3nfe6upUjgPOeUh4jLbdK51rpEqdyvRX0mfmsobM UdlLV0e8Rnn2Uo4kav0KxMXGmFbVU6PjsTB9+4fgez2ESVxH7mcQJYqsYupZHV4+GWGc 4LnZEhdujyHOfT0BOZLZ4bCLj9XRrwdfZaENc46meQB1d+BbWkZcV4yJ2WmrEsKFfZqd MW7DMSy8Kk3NNFuCS8z9TCIIjkyrPTjHsAK6KCpcW69eekLrXvl8wnyZz18fMvaWdxih VYjyb5QqwgOBLYJmIgdaZ1dxFLXRviczXGjKbH+YBmZlAJ5b7g+HbiVh3AbTngIw/dMq Wamg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=STFKS12E3vLYK3UNmgJCBOkO7FoCR90LCYdRWS1nrkU=; b=YQIt+pkOXxJt63epTj8H/0hClN6iWtPK0sr/eh+v/taf01PiTiM3bZ+eWSVR1GSPbt x3QCXvsLWQyoQ8763SfVosjzwluCcmFk4hf8Hw58EfSm3SzCgJpX3ew2wFa8k3oFnZEq vMM+AQEbycDhcGYEl3vpKoPFZoUP8O4YvBZ8gYz/sIxP1tZhlJ2Zlm0t3UU8DmO/ZW+1 m4kge8cnAewCNFu9ADN3HhhjDA0MAF/P+LthlabFvIpZ2uHSPiuYJkY858uG9ep52Dnf WOwnziSvxcWPz0e+9nlEl85BjNeZAWFWRJJhR8Y/ARpHPdbp9Xgll31trZnyz6XkVbtO MgmQ== X-Gm-Message-State: AOAM530hEz2DYCPDqBaGe3vDU+9PQ6STUaLyyp8sAMxtLw70XBjXu2Io vVDOJj/Xb6k31dQHnadskCuaBg== X-Google-Smtp-Source: ABdhPJzDLd/whjorKZEylwGhZdO8iGJnskguvzXq9wfFxW5kjqlcydqP1s4T849NjPD/gtResQtlHQ== X-Received: by 2002:ac8:6d0a:: with SMTP id o10mr765947qtt.141.1589500821716; Thu, 14 May 2020 17:00:21 -0700 (PDT) Received: from nimloth.alvh.no-ip.org ([190.95.18.252]) by smtp.gmail.com with ESMTPSA id d123sm354050qkb.28.2020.05.14.17.00.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 May 2020 17:00:20 -0700 (PDT) Received: by nimloth.alvh.no-ip.org (Postfix, from userid 1000) id 83CFF30069A; Thu, 14 May 2020 20:00:17 -0400 (-04) Date: Thu, 14 May 2020 20:00:17 -0400 From: Alvaro Herrera To: Erik Rijkers , Laurenz Albe , Fabien COELHO , =?iso-8859-1?Q?J=FCrgen?= Purtz , Peter Eisentraut , Corey Huinker Cc: Justin Pryzby , Roger Harkavy , pgsql-hackers@postgresql.org, Michael Paquier Subject: Re: Add A Glossary Message-ID: <20200515000017.GA12687@alvherre.pgsql> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="zYM0uCDKw75PZbzx" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <280f7b1e-2e81-5d0e-2589-b5ea6313ad74@purtz.de> <20200405220715.GA20664@alvherre.pgsql> <686e65ba48801791816771e993ff5eebe571fac9.camel@cybertec.at> <7b9b469e804777ac9df4d37716db935e@xs4all.nl> User-Agent: Mutt/1.10.1 (2018-07-13) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk --zYM0uCDKw75PZbzx Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit Thanks everybody. I have compiled together all the suggestions and the result is in the attached patch. Some of it is of my own devising. * I changed "instance", and made "cluster" be mostly a synonym of that. * I removed "global SQL object" and made "SQL object" explain it. * Added definitions for ACID, sequence, bloat, fork, FSM, VM, data page, transaction ID, epoch. * Changed "a SQL" to "an sql" everywhere. * Sorted alphabetically. * Removed caps in term names. I think I should get this pushed, and if there are further suggestions, they're welcome. Dim Fontaine and others suggested a number of terms that could be included; see https://twitter.com/alvherre/status/1246192786287865856 -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services --zYM0uCDKw75PZbzx Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="glossfixes-2.patch" diff --git a/doc/src/sgml/acronyms.sgml b/doc/src/sgml/acronyms.sgml index f638665dc9..b05c065546 100644 --- a/doc/src/sgml/acronyms.sgml +++ b/doc/src/sgml/acronyms.sgml @@ -766,7 +766,7 @@ XID - Transaction Identifier + Transaction identifier diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml index 8c6cb6e942..d4255215aa 100644 --- a/doc/src/sgml/glossary.sgml +++ b/doc/src/sgml/glossary.sgml @@ -7,8 +7,22 @@ + + ACID + + + Atomicity, + consistency, + isolation, and + durability. + A set of properties of database transactions intended to guarantee validity + in concurrent operation and even in event of errors, power failures, etc. + + + + - Aggregate Function + Aggregate function A function that @@ -35,11 +49,15 @@ to make decisions about how to execute queries. + + (Don't confuse this term with the ANALYZE option + to the command.) + - Analytic Function + Analytic function @@ -106,8 +124,8 @@ Process of an instance - which act on behalf of client sessions - and handle their requests. + which acts on behalf of a client session + and handles its requests. (Don't confuse this term with the similar terms @@ -118,7 +136,7 @@ - Background Worker (process) + Background worker (process) Process within an instance, @@ -138,10 +156,11 @@ - Background Writer (process) + Background writer (process) - A process that continuously writes dirty pages from + A process that continuously writes dirty + data pages from shared memory to the file system. It wakes up periodically, but works only for a short period in order to distribute its expensive I/O @@ -155,6 +174,16 @@ + + Bloat + + + Space in data pages which does not contain relevant data, + such as unused (free) space or outdated row versions. + + + + Cast @@ -190,7 +219,7 @@ - Check Constraint + Check constraint A type of constraint @@ -208,15 +237,6 @@ - - Checkpointer (process) - - - A specialized process responsible for executing checkpoints. - - - - Checkpoint @@ -244,6 +264,15 @@ + + Checkpointer (process) + + + A specialized process responsible for executing checkpoints. + + + + Class (archaic) @@ -262,27 +291,6 @@ - - Cluster - - - A group of databases plus their - global SQL objects. The - cluster is managed by exactly one - instance. A newly created - Cluster will have three databases created automatically. They are - template0, template1, and - postgres. It is expected that an application will - create one or more additional database aside from these three. - - - (Don't confuse the PostgreSQL-specific term - Cluster with the SQL - command CLUSTER). - - - - Column @@ -363,7 +371,10 @@ A restriction on the values of data allowed within a - Table. + table, + or in attributes of a + + domain. For more information, see @@ -373,18 +384,18 @@ - Data Area + Data area - Data Directory + Data directory The base directory on the filesystem of a server that contains all data files and subdirectories associated with a - cluster with the + instance with the exception of tablespaces. The environment variable PGDATA is commonly used to refer to the @@ -416,15 +427,31 @@ - Database Server + Database server + + Data page + + + The basic structure used to store relation data. + All pages are of the same size. + Data pages are typically stored on disk, each in a specific file, + and can be read to shared buffers + where they can be modified, becoming + dirty. They get clean by being written down + to disk. New pages, which initially exist in memory only, are also + dirty until written. + + + + Datum - The internal representation of one value of a SQL + The internal representation of one value of an SQL data type. @@ -434,7 +461,7 @@ Delete - A SQL command which removes + An SQL command which removes rows from a given table or relation. @@ -459,6 +486,11 @@ + + Epoch + + + Extension @@ -475,12 +507,13 @@ - File Segment + File segment A physical file which stores data for a given relation. - File segments are limited in size by a configuration value, + File segments are limited in size by a configuration value + (typically 1 gigabyte), so if a relation exceeds that size, it is split into multiple segments. @@ -495,14 +528,14 @@ - Foreign Data Wrapper + Foreign data wrapper A means of representing data that is not contained in the local database so that it appears as if were in local - table(s). With a Foreign Data Wrapper it is + table(s). With a foreign data wrapper it is possible to define a foreign server and - foreign tables. + foreign tables. For more information, see @@ -512,7 +545,7 @@ - Foreign Key + Foreign key A type of constraint @@ -527,7 +560,7 @@ - Foreign Server + Foreign server A named collection of @@ -544,7 +577,7 @@ - Foreign Table + Foreign table A relation which appears to have @@ -564,6 +597,35 @@ + + Fork + + + Each of the separate segmented file sets that a relation stores its + data in. There exist a main fork and two secondary + forks: the free space map + visibility map. + + + + + + Free space map (fork) + + + A storage structure that keeps metadata about each data page in a table's + main storage space. The free space map entry for each space stores the + amount of free space that's available for future tuples, and is structured + so it is efficient to search for available space for a new tuple of a given + size. + + + For more information, see + . + + + + Function @@ -579,28 +641,11 @@ - - Global SQL Object - - - SQL objects which do - not belong to a specific - database. - - - These objects are - roles, - tablespaces, - replication origins, and subscriptions for logical replication. - - - - Grant - A SQL command that is used to allow + An SQL command that is used to allow users or role to access specific objects within the database. @@ -620,7 +665,8 @@ attributes (i.e. the data) for a relation. The heap is realized within - segment files. + segmented files + in the relation's main fork. @@ -629,10 +675,11 @@ Host - A computer that communicates with other hosts over a network. - This term can be used to refer to either a - client - or a server. + A computer that communicates with other computers over a network. + This is sometimes used as a synonym for + server. + It is also used to refer to a computer where + client processes run. @@ -659,7 +706,7 @@ Insert - A SQL command used to add new data into a + An SQL command used to add new data into a table. @@ -673,27 +720,28 @@ Instance - An instance is a group of processes, - its supporting storage space, - plus their - common shared memory, - running on a single server. - The instance - handles all key features of a DBMS: read and write - access to files and shared memory, assurance of - the ACID paradigm, MVCC, - connections to client programs, backup, - recovery, replication, privileges, etc. + A set of databases and accompanying global SQL objects that are stored in + the same data directory + in a single server. + If running, one + postmaster process + manages a group of backend and auxiliary processes that communicate + using a common shared memory + area. Many instances can run on the same + server + as long as their TCP ports do not conflict. - An instance manages exactly one - cluster. + The instance handles all key features of a DBMS: + read and write access to files and shared memory, + assurance of the ACID properties, + connections to client processes, + backup, recovery, replication, privileges, etc. - Many instances can run on the same server as - long as their TCP/IP ports do not conflict. - Different instances on a server may use the - same or different versions of PostgreSQL. + In PostgreSQL, the term + cluster is also sometimes used to refer to an instance. + (Don't confuse this term with the SQL command CLUSTER.) @@ -717,7 +765,7 @@ Join - A SQL keyword used in SELECT statements for + An SQL keyword used in SELECT statements for combining data from multiple relations. @@ -748,7 +796,7 @@ - Log File + Log file Log files contain human-readable text lines about events. @@ -761,6 +809,29 @@ + + Log record + + + Archaic term for a WAL record. + + + + + + Logged + + + A table is considered + logged if changes to it are sent to the + WAL. By default, all regular + tables are logged. A table can be specified as + unlogged either at + creation time or via the ALTER TABLE command. + + + + Logger (process) @@ -780,29 +851,6 @@ - - Log Record - - - Archaic term for a WAL record. - - - - - - Logged - - - A table is considered - logged if changes to it are sent to the - WAL. By default, all regular - tables are logged. A table can be specified as - unlogged either at - creation time or via the ALTER TABLE command. - - - - Master (server) @@ -831,7 +879,7 @@ - Materialized View + Materialized view A relation that is @@ -869,8 +917,8 @@ Null - A concept of non-existence that is a central tenet of Relational - Database Theory. It represents the absence of value. + A concept of non-existence that is a central tenet of relational + database theory. It represents the absence of a definite value. @@ -881,7 +929,7 @@ - Parallel Query + Parallel query The ability to handle parts of executing a @@ -918,7 +966,7 @@ - Partitioned Table + Partitioned table A relation that is @@ -960,7 +1008,7 @@ - Primary Key + Primary key A special case of a @@ -1011,7 +1059,7 @@ - Query Planner + Query planner The part of PostgreSQL that is devoted to @@ -1034,7 +1082,7 @@ - Referential Integrity + Referential integrity A means of restricting data in one relation @@ -1055,6 +1103,7 @@ attributes defined in a specific order. Tables, + sequences, views, foreign tables, materialized views, and @@ -1068,7 +1117,7 @@ - Replica + Replica (server) A database that is paired @@ -1098,12 +1147,13 @@ - Result Set + Result set A data structure transmitted from a backend process to - client program upon the completion of a SQL + a client upon the + completion of an SQL command, usually a SELECT but it can be an INSERT, UPDATE, or DELETE command if the RETURNING @@ -1135,7 +1185,7 @@ A collection of access privileges to the instance. - Roless are themselves a privilege that can be granted to other roles. + Roles are themselves a privilege that can be granted to other roles. This is often done for convenience or to ensure completeness when multiple users need the same privileges. @@ -1170,7 +1220,7 @@ Savepoint - A special mark inside the sequence of steps in a + A special mark in the sequence of steps in a transaction. Data modifications after this point in time may be reverted to the time of the savepoint. @@ -1186,13 +1236,15 @@ Schema - A schema is a namespace for SQL objects, + A schema is a namespace for + SQL objects, which all reside in the same database. Each SQL object must reside in exactly one schema. - The names of SQL objects of the same type in the same schema are enforced unique. + The names of SQL objects of the same type in the same schema are enforced + to be unique. There is no restriction on reusing a name in multiple schemas. @@ -1205,7 +1257,7 @@ - More generically, the term Schema is used to mean + More generically, the term schema is used to mean all data descriptions (table definitions, constraints, comments, etc) for a given database or @@ -1264,7 +1316,10 @@ A computer on which PostgreSQL instances run. The term server denotes real hardware, a - container, or a Virtual Machine. + container, or a virtual machine. + + + This term is sometimes used to refer to an instance or to a host. @@ -1279,45 +1334,21 @@ - - Shared Memory + Shared memory RAM which is used by the processes common to an @@ -1342,28 +1373,32 @@ + + Standby (server) + + + SQL Object - A table, - view, - materialized view, - index, - constraint, - sequence, - function, - procedure, - trigger, - data type, or operator. Every one of those SQL objects - belong to exactly one Schema. - - - There also exist SQL objects that do not belong to schemas; those include - extensions, - data type cases, - and - foreign data wrappers. + Any object that can be created with a CREATE + command. Most objects are specific to one database, and are commonly + known as local objects. + Roles, + tablespaces, + replication origins, subscriptions for logical replication, and + databases themselves are not local SQL objects since they exist + entirely outside of any specific database; + they are called global objects. + + + Most local objects belong to a specific + schema in their containing database. + There also exist local objects that do not belong to schemas; some examples are + extensions, + data type casts, and + foreign data wrappers. For more information, see @@ -1373,7 +1408,7 @@ - SQL Standard + SQL standard A series of documents that define the SQL language. @@ -1382,11 +1417,11 @@ - Stats Collector + Stats collector This process collects statistical information about the - Cluster's activities. + instance's activities. For more information, see @@ -1396,22 +1431,21 @@ - System Catalog + System catalog A collection of tables which describe the structure of all SQL objects - of each database - and the global SQL objects - of the cluster. + of the instance. The system catalog resides in the schema pg_catalog. These tables contain data in internal representation and are not typically considered useful for user examination; a number of user-friendlier views also in schema pg_catalog offer more convenient access to some of that information, while additional tables and views - exist in schema information_schema that expose some + exist in schema information_schema + (see ) that expose some of the same and additional information as mandated by the SQL standard. @@ -1431,7 +1465,7 @@ attributes, in the same order, having the same name and type per position). A table is the most common form of - Relation in + relation in PostgreSQL. @@ -1461,7 +1495,7 @@ - Temporary Table + Temporary table Tables that exist either @@ -1499,6 +1533,30 @@ + + Transaction ID + + + The numerical, unique, sequentially-assigned identifier that each + transaction receives when it first causes a database modification. + Frequently abbreviated xid. + When stored on disk, xids are only 32-bits wide, so only + approximately four billion write transactions IDs can be generated; + to permit the system to run for longer than that would allow, + epochs are used, also 32 bits wide. + When the counter reaches the maximum xid value, it starts over at + 3 (values under that are reserved) and the + epoch value is incremented by one. + In some contexts, both the epoch and xid values are + considered together as a single 64-bit value. + + + For more information, see + . + + + + Trigger @@ -1508,7 +1566,7 @@ UPDATE, DELETE, TRUNCATE) is applied to a relation. - A Trigger executes within the same + A trigger executes within the same transaction as the statement which invoked it, and if the function fails, then the invoking statement also fails. @@ -1535,7 +1593,7 @@ - Unique Constraint + Unique constraint A type of constraint @@ -1577,7 +1635,7 @@ Update - A SQL command used to modify + An SQL command used to modify rows that may already exist in a specified table. It cannot create or remove rows. @@ -1621,7 +1679,7 @@ The process of removing outdated tuple versions from tables, and other closely related - garbage-collection-like processing required by PostgreSQL's + processing required by PostgreSQL's implementation of MVCC. This can be initiated through the use of the VACUUM command, but can also be handled automatically @@ -1651,8 +1709,28 @@ + + Visibility map (fork) + + + A storage structure that keeps metadata about each data page + in a table's main storage space. The visibility map entry for + each page stores two bits: the first one + (all-visible) indicates that all tuples + in the page are visible to all transactions. The second one + (all-frozen) indicate that all tuples + in the page are marked frozen. + + + + + + WAL + + + - WAL Archiver (process) + WAL archiver (process) A process that saves copies of WAL files @@ -1667,7 +1745,7 @@ - WAL File + WAL file Also known as WAL segment or @@ -1678,7 +1756,7 @@ and are written in sequential order, interspersing changes as they occur in multiple simultaneous sessions. If the system crashes, the files are read in order, and each of the - changes are replayed to restore the system to the state as it was + changes is replayed to restore the system to the state as it was before the crash. @@ -1696,13 +1774,8 @@ - - WAL - - - - WAL Record + WAL record A low-level description of an individual data change. @@ -1719,17 +1792,17 @@ - WAL Segment + WAL segment - WAL Writer (process) + WAL writer (process) A process that writes WAL records - from shared memory to - WAL files. + from shared memory to + WAL files. For more information, see @@ -1739,7 +1812,7 @@ - Window Function + Window function A type of function whose @@ -1759,7 +1832,7 @@ - Write-Ahead Log + Write-ahead log The journal that keeps track of the changes in the --zYM0uCDKw75PZbzx--