Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s3668-009P1u-EI for pgsql-general@arkaria.postgresql.org; Sat, 04 May 2024 03:30:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s3664-00EYl3-Sq for pgsql-general@arkaria.postgresql.org; Sat, 04 May 2024 03:30:01 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s3664-00EYkk-Hh for pgsql-general@lists.postgresql.org; Sat, 04 May 2024 03:30:01 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s365y-001RXN-Ke for pgsql-general@lists.postgresql.org; Sat, 04 May 2024 03:30:00 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 4443TqwI2805643; Fri, 3 May 2024 23:29:52 -0400 From: Tom Lane To: "David G. Johnston" cc: Siddharth Jain , "pgsql-general@lists.postgresql.org" Subject: Re: Question regarding how databases support atomicity In-reply-to: References: Comments: In-reply-to "David G. Johnston" message dated "Fri, 03 May 2024 20:10:31 -0700" MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <2805641.1714793392.1@sss.pgh.pa.us> Date: Fri, 03 May 2024 23:29:52 -0400 Message-ID: <2805642.1714793392@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk "David G. Johnston" writes: > On Friday, May 3, 2024, Siddharth Jain wrote: >> The way I understand this is that if there is a failure in-between, we >>> start undoing and reverting the previous operations one by one. > Not in PostgreSQL. All work performed is considered provisional until a > commit succeeds. At which point all provisional work, which had been > tagged with the same transaction identifier, becomes reality to the rest of > the system, by virtue of marking the transaction live. Right. We don't use UNDO; instead, we use multiple versions of database rows (MVCC). A transaction doesn't modify the contents of existing rows, but just marks them as provisionally outdated, and then inserts new row versions that are marked provisionally inserted. Other transactions ignore the outdate markings and the uncommitted new rows, until the transaction commits, at which time the new versions become live and the old ones become dead. If the transaction never does commit -- either through ROLLBACK or a crash -- then the old row versions remain live and the new ones are dead. In either case, we don't have a consistency or correctness problem, but we do have dead rows that must eventually get vacuumed away to prevent indefinite storage bloat. That can be done by background housekeeping processes though (a/k/a autovacuum). I believe Oracle, for one, actually does use UNDO. I don't know what they do about failure-to-UNDO. regards, tom lane