Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.92) (envelope-from ) id 1jFNnE-0007vn-1G for pgsql-hackers@arkaria.postgresql.org; Fri, 20 Mar 2020 19:58:56 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.89) (envelope-from ) id 1jFNnA-0005Uj-SG for pgsql-hackers@arkaria.postgresql.org; Fri, 20 Mar 2020 19:58:52 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1jFNnA-0005Ub-DT for pgsql-hackers@lists.postgresql.org; Fri, 20 Mar 2020 19:58:52 +0000 Received: from mail-qt1-x841.google.com ([2607:f8b0:4864:20::841]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1jFNn5-0006LM-S8 for pgsql-hackers@postgresql.org; Fri, 20 Mar 2020 19:58:51 +0000 Received: by mail-qt1-x841.google.com with SMTP id f20so6084874qtq.6 for ; Fri, 20 Mar 2020 12:58:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=telsasoft-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=qBH1bFa2YU4DMMC3EFclYLge4U2cj8kGzW0Nd+3eozQ=; b=Wmv2FEyAvXkqyiaZymjWTIJXP7o4KFSMYIqM8x5E6TEJfLN5Q9EJdAetITKGrwk1bo sftODq4TXrgi32Wa9o4gZHQmTiLysgNxBbxcxStBZCFDadJkKSchYEiJKd5js/shGmMU ny2xaKjg0stbHDuTboGm/rHCkj0Wnkst2jN2iGbUvwVMcwjtuc6/bJVdQDQE3+50sgpw itjXcZZeQ/ojrPoNQTb2ay0BbicEaz0WB8BRvmOloA51IEgr6yQG4jylbrLu/rcdo/h+ X7+PaVWAFSqJjo8ljYLpftgODVeu1XkBY3LnoY/vVJQ67WMsylA0R7Oz6A3XGnnFTuUa 9f9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=qBH1bFa2YU4DMMC3EFclYLge4U2cj8kGzW0Nd+3eozQ=; b=VPIvExrLrLeijZ26kdO+bwYaY9z2XD/UNRx/rSI9UfhvY+g1TVTexjm1eGZgmnXhvg 0g/vPxO/HtgaFyEuil+3mpBzMbrY73lUb+oCEK3BSXpop/mLAp8AvE45EPXr9zRbZ0/3 Ra2I+jNzfb8Dtli0lojsEHKSNu69kBdn9kh3nPdR4BRBOt3/M91TPICP4qCGLJ4oGtFR eu6O3EQQeGwGiGjBm40YQBLW3QRRw02nbelcyvx1Rxqp+cMOJfvkiAKC6y6pfcxYitdQ 6geo736R8MJTmF3haR1rMrJqq0nCEIcj2rymQqjMQK4KTauRC0FrVx8g+UvNg3v/45RV jybA== X-Gm-Message-State: ANhLgQ03D/w8HPk6zkApojMArwt4V58w065Qdb2pV7yeGO2XGnqsW03b AHMISSmstytjtzbbIE01Xf+/PA== X-Google-Smtp-Source: ADFU+vt+3EmvCbkSCt5lrh6qnJLJSGrkEv55l0fcfzLkFTBo9GKhK8hOkV0luNHPbB+nw+DSguD1+A== X-Received: by 2002:ac8:7184:: with SMTP id w4mr6945578qto.263.1584734324994; Fri, 20 Mar 2020 12:58:44 -0700 (PDT) Received: from pryzbyj (charmander.telsasoft.com. [50.244.222.1]) by smtp.gmail.com with ESMTPSA id l45sm5618080qtb.8.2020.03.20.12.58.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 20 Mar 2020 12:58:43 -0700 (PDT) Received: by pryzbyj (Postfix, from userid 1000) id 0B4BA800C5C; Fri, 20 Mar 2020 14:58:41 -0500 (CDT) Date: Fri, 20 Mar 2020 14:58:41 -0500 From: Justin Pryzby To: Alvaro Herrera Cc: Corey Huinker , =?iso-8859-1?Q?J=FCrgen?= Purtz , Roger Harkavy , pgsql-hackers@postgresql.org, Fabien COELHO , Michael Paquier Subject: Re: Add A Glossary Message-ID: <20200320195841.GA13662@telsasoft.com> References: <20200320001122.GA19602@alvherre.pgsql> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200320001122.GA19602@alvherre.pgsql> User-Agent: Mutt/1.9.4 (2018-02-28) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk On Thu, Mar 19, 2020 at 09:11:22PM -0300, Alvaro Herrera wrote: > + Aggregate > + > + > + To combine a collection of data values into a single value, whose > + value may not be of the same type as the original values. > + Aggregate Functions > + combine multiple Rows that share a common set > + of values into one Row, which means that the > + only data visible in the values in common, and the aggregates of the IS the values in common ? (or, "is the shared values") > + Analytic > + > + > + A Function whose computed value can reference > + values found in nearby Rows of the same > + Result Set. > + Archiver Can you change that to archiver process ? > + Atomic .. > + > + In reference to an operation: An event that cannot be completed in > + part: it must either entirely succeed or entirely fail. A series of Can you say: "an action which is not allowed to partially succed and then fail, ..." > + Autovacuum Say autovacuum process ? > + > + > + Processes that remove outdated MVCC I would say "A set of processes that remove..." > + Records of the Heap and I'm not sure, can you say "tuples" ? > + Backend Process > + > + > + Processes of an Instance which act on behalf of Say DATABASE instance > + Backend Server > + > + > + See Instance. same > + Background Worker > + > + > + Individual processes within an Instance, which same > + run system- or user-supplied code. Typical use cases are processes > + which handle parts of an SQL query to take > + advantage of parallel execution on servers with multiple > + CPUs. I would say "A typical use case is" > + Background Writer Add "process" ? > + > + > + Writes continuously dirty pages from Shared Say "Continuously writes" > + Memory to the file system. It starts periodically, but Hm, maybe "wakes up periodically" > + Cast > + > + > + A conversion of a Datum from its current data > + type to another data type. maybe just say A conversion of a Datum another data type. > + Catalog > + > + > + The SQL standard uses this standalone term to > + indicate what is called a Database in > + PostgreSQL's terminology. Maybe remove "standalone" ? > + Checkpointer Process > + A process that writes dirty pages and WAL > + Records to the file system and creates a special Does the chckpointer actually write WAL ? > + checkpoint record. This process is initiated when predefined > + conditions are met, such as a specified amount of time has passed, or > + a certain volume of records have been collected. collected or written? I would say: > + A checkpoint is usually initiated by > + a specified amount of time having passed, or > + a certain volume of records having been written. > + Checkpoint > + > + > + A Checkpoint is a point in time Extra space > + > + Connection > + > + > + A TCP/IP or socket line for inter-process I don't know if I've ever heard the phase "socket line" I guess you mean a unix socket. > + Constraint > + > + > + A concept of restricting the values of data allowed within a > + Table. Just say: "A restriction on the values..."? > + Data Area Remove this ? I've never heard this phrase before. > + > + > + The base directory on the filesystem of a > + Server that contains all data files and > + subdirectories associated with a Cluster with > + the exception of tablespaces. The environment variable Should add an entry for "tablespace". > + Datum > + > + > + The internal representation of a SQL data type. I'm not sure if should use "a SQL" or "an SQL", but not both. > + Delete > + > + > + A SQL command whose purpose is to remove just say "which removes" > + > + File Segment > + > + > + If a heap or index file grows in size over 1 GB, it will be split 1GB is the default "segment size", which you should define. > + > + Foreign Data Wrapper > + > + > + A means of representing data that is not contained in the local > + Database as if were in local > + Table(s). I'd say: + A means of representing data as a Table(s) even though + it is not contained in the local Database > + > + Foreign Key > + > + > + A type of Constraint defined on one or more > + Columns in a Table which > + requires the value in those Columns to uniquely > + identify a Row in the specified > + Table. An FK doesn't require the values in its table to be unique, right ? I'd say something like: "..which enforces that the values in those Columns are also present in an(other) table." Reference Referential Integrity? > + Function > + > + > + Any pre-defined transformation of data. Many > + Functions are already defined within > + PostgreSQL itself, but can also be > + user-defined. I would remove "pre-", since you mentioned that it can be user-defined. > + Global SQL Object > + > + > + > + Not all SQL Objects belong to a certain > + Schema. Some belong to the complete > + Database, or even to the complete > + Cluster. These are referred to as > + Global SQL Objects. Collations and Extensions > + such as Foreign Data Wrappers reside at the > + Database level; Database > + names, Roles, > + Tablespaces, Replication > + origins, and subscriptions for logical > + Replication at the > + Cluster level. I think "complete" is the wrong world. I would say: "An object which is not specific to a given database, but instead shared across the entire Cluster". > + > + Grant > + > + > + A SQL command that is used to enable I'd say "allow" > + > + Heap > + > + > + Contains the original values of Row attributes I'm not sure what "original" means here ? > + (i.e. the data). The Heap is realized within > + Database files and mirrored in > + Shared Memory. I wouldn't say mirrored, and probably just remove at least the part after "and". > + > + Host > + > + > + See Server. Or client. Or proxy at some layer or other intermediate thing. Maybe just remove this. > + > + Index > + > + > + A Relation that contains data derived from a > + Table (or Relation such > + as a Materialized View). It's internal Its > + structure supports very fast retrieval of and access to the original > + data. > + Instance > + > + ... > + > + Many Instances can run on the same server as > + long as they use different IP ports and manage I would say "as long as their TCP/IP ports or sockets don't conflict, and manage..." > + Join > + > + > + A technique used with SELECT statements for > + correlating data in one or more Relations. I would refer to this as a SQL keyword allowing to combine data from multiple relations. > + Lock > + > + > + A mechanism for one process temporarily preventing data from being > + manipulated by any other process. I'd say: + A mechanism by which a process protects simultaneous access to a resource + by other processes. (I said "protects" since shared locks don't prevent all access, and it's easier than explaining "unsafe access"). > + > + Log File > + > + > + LOG files contain readable > + text lines about serious and non-serious events, e.g.: use of wrong > + password, long-running queries, ... . Serious and non-serious? > + Log Writer process > + > + > + If activated and parameterized, the I don't know what parameterized means here > + Log Writer process > + writes information about database events into the current > + Log file. When reaching certain time- or > + volume-dependent criterias, he creates a new I think criteria is the plural.. > + Log Record Can we remove this ? Couple releases ago, "pg_xlog" was renamed to pg_wal. I'd prefer to avoid defining something called "Log Record" about WAL that's right next to text logs. > + Logged > + > + > + A Table is considered > + Logged if changes to it are sent to the > + WAL Log. By default, all regular > + Tables are Logged. A > + Table can be speficied as unlogged either at > + creation time or via the ALTER TABLE command. The > + primary use of unlogged Tables is for storing > + transient work data that must be shared across processes, but with a > + final result stored in logged Tables. > + Temporary Tables are always unlogged. > + > + > + Maybe it's be better to define "unlogged", since 1) logged is the default; and 2) it's right next to text logs. > + Master > + > + > + When two or more Databases are linked via > + Replication, the Server > + that is considered the authoritative source of information is called > + the Master. I think it'd actually be the <> which is authoritative, in case they're running on the same <> > + > + Materialized > + > + > + The act of storing information rather than just the means of accessing remove "means of" ? > + the information. This term is used in Materialized > + Views meaning that the data derived from the > + View is actually stored on disk separate from separately > + the sources of that data. When the term > + Materialized is used in speaking about > + mulit-step queries, it means that the data of a given step is stored multi > + (in memory, but that storage may spill over onto disk). > + > + > + > + > + > + Materialized View > + > + > + A Relation that is defined in the same way that > + a View is, but it stores data in the same way change "it stores" to stores > + > + Partition > + > + > + > + a) A Table that can be queried independently by > + its own name, but can also be queried via another just say "on its own" or "directly" > + Table, a partitionend partitioned also, put it in parens, like "via another table (a partitioned table)..." > + Table, which is a collection of Say "set" here since you later talk about "subsets" and sets. > + > + Primary Key > + > + > + A special case of Unique Index defined on a > + Table or other Relation > + that also guarantees that all of the Attributes > + within the Primary Key do not have > + Null values. As the name implies, there can be > + only one Primary Key per > + Table, though it is possible to have multiple > + Unique Indexes that also have no > + Null-capable Attributes. I would say "multiple >>unique indexes<< on >>attributes<< defined as not nullable. > + Procedure > + > + > + A defined set of instructions for manipulating data within a > + Database. Procedure can "procedures" or "a procedure" > + Record > + > + > + See Tupple. Tupple is back. And again below. > + A single Row of a Table > + or other Relation. I think it's commonly used to mean "an instance of a row" (in an MVCC sense), but maybe that's too much detail for here. > + Referential Integrity > + > + > + The means of restricting data in one Relation A means > + > + Relation > + > + > + The generic term for all objects in a Database "A generic term for any object in a >>database<< that has a name and..." > + > + Result Set > + > + > + A data structure transmitted from a Server to > + client program upon the completion of a SQL > + command, usually a SELECT but it can be an > + INSERT, UPDATE, or > + DELETE command if the RETURNING > + clause is specified. I'd remove everything in that sentence after "usually". > + Revoke > + > + > + A command to reduce access to a named set of s/reduce/prevent/ ? > + Row > + > + > + See Tupple. tuple > + > + Savepoint > + > + > + A special mark (such as a timestamp) inside a > + Transaction. Data modifications after this > + point in time may be rolled back to the time of the savepoint. I don't think "timestamp" is a useful or accurate analogy for this. > + Schema > + > + > + A schema is a namespace for > + SQL objects, which all reside in the same > + database. Each SQL > + object must reside in exactly one > + Schema. > + > + > + In general, the names of SQL objects in the > + schema are unique - even across different types of objects. The lone > + exception is the case of Unique > + Constraints, in which case there > + must be a Unique Index > + with the same name and Schema as the > + Constraint. There is no restriction on having > + a name used in multiple Schemas. I think there's some confusion. Constraints are not objects, right ? But, constraints do have an exception (not just unique constraints, though): the constraint is only unique on its table, not in its database/schema. "pg_constraint_conrelid_contypid_conname_index" UNIQUE, btree (conrelid, contypid, conname) CLUSTER > + Select > + > + > + The command used to query a Database. Normally, > + SELECTs are not expected to modify the > + Database in any way, but it is possible that > + Functions invoked within the query could have > + side-effects that do modify data. I think there should be references to the sql-* pages for this and others. > + > + Serializable > + > + > + Transactions defined as SERIALIZABLE are unable to > + see changes made within other transactions. In effect, for the > + initializing session the entire Database > + appears to be frozen duration such a > + Transaction. Do you mean "for the duration of the >>Transaction<<" > + > + Session > + > + > + A Connection to the Database. > + > + > + A description of the commands that were issued in the life cycle of a > + particular Connection to the > + Database. I'm not sure what this means. > + Sequence > + > + > + > + An Database object which represents the A not An > + mathematical concept of a numerical integral sequence. It can be > + thought of as a Table with exactly one > + Row and one Column. The > + value stored is known as the current value. A > + Sequence has a defined direction (almost always > + increasing) and an interval step (usually 1). Whenever the > + NEXTVAL pseudo-column of a > + Sequence is accessed, the current value is moved > + in the defined direction by the defined interval step, and that value say "given interval step" > + Shared Memory > + > + > + RAM which is used by the processes common to an > + Instance. It mirrors parts of > + Database files, provides an area for > + WAL Records, Do we use shared_buffers for WAL ? > + > + Table > + > + > + A collection of Tuples (also known as > + Rows or Records) having > + a common data structure (the same number of > + Attributess, in the same order, having the same Attributes has two esses. > + name and type per position). A Table is the I don't think you need to say here that the columns of a table all have the same type and order. > + Temporary Tables > + > + > + Tables that exist either for the lifetime of a > + Session or a > + Transaction, as defined at creation time. The I would say "as specified at the time of its creation". > + Transaction > + > + > + A combination of one or more commands that must act as a single Remove "one or more" > + Trigger > + > + > + A Function which can be defined to execute > + whenever a certain operation (INSERT, > + UPDATE, or DELTE) is applied to > + that Relation. A Trigger s/that/a/ > + > + Unique > + > + > + The condition of having no matching values in the same s/matching/duplicate/ > + Relation. Most often used in the concept of s/concept/context/ > + > + Update > + > + > + A command used to modify Rows that already or 'may already' > + WAL File ... > + > + The sequence of WAL Records in combination with > + the sequence of WAL Files represents the Remove "in combination with the sequence of >WAL Files<" > + > + WAL Log Can you just say WAL or "write-ahead log". > + > + > + A WAL Record contains either new or changed > + Heap or Index data or > + information about a COMMIT, > + ROLLBACK, SAVEPOINT, or > + Checkpointer operation. WAL records use a > + non-printabe binary format. non-printable Or just remove it. Or just remove the sentence. > + WAL Writer process > + > + Window Function > + > + > + A type of Function similar to an > + Aggregate in that can derive its value from a in that IT > + set of Rows in a Result > + Set, but still retaining the original source data. -- Justin