Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oF1DW-0001ux-8Y for pgsql-docs@arkaria.postgresql.org; Fri, 22 Jul 2022 22:33:54 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1oF1DU-00075D-Ln for pgsql-docs@arkaria.postgresql.org; Fri, 22 Jul 2022 22:33:52 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oF1DU-000753-E8 for pgsql-docs@lists.postgresql.org; Fri, 22 Jul 2022 22:33:52 +0000 Received: from mail-ed1-x534.google.com ([2a00:1450:4864:20::534]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1oF1DQ-0005o6-1k for pgsql-docs@lists.postgresql.org; Fri, 22 Jul 2022 22:33:51 +0000 Received: by mail-ed1-x534.google.com with SMTP id u12so2348283edd.5 for ; Fri, 22 Jul 2022 15:33:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bowt-ie.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=bZUzcrYHdyhHMV167JTCJFhaY51fUJvJ00vv21i68F4=; b=8NWuHmdxJpQS2UCxoTanIre9iyLSYoFzuPQAfaFi3UBDBHy15QFLIqCbfadaqnCuQR rOS7iKPYiaz4l1QAucq0obdOPWA23pkbX88nvIlFaoXTbCyhfDdKdT0oI8zqylYnOR4m h3tbZQtm4eCRrQy5BsbAUaj2qVBCkp/dQxrRXmjy3oTiDfasH27zPAI4SvQQNoOiiyHn nsYXi2EUme2VvWqqFAWVjLR2RfxMdYrRJVD7GzjJxGtZTRd+wxmAYC/nbd3SCVGpdUaA 79BRY+bP02LSVVeBJkKwEoRcWPo4jORtxEyBXa2ywGsYMIKhYMUQiHP9YS68rFZ/ffcp XUYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bZUzcrYHdyhHMV167JTCJFhaY51fUJvJ00vv21i68F4=; b=hY4TydRkaruwRcBpQdvPUS6jE2KuuZo5Ni8mfI1caB0/tNyY1yzlsfc+lnrCPkUZZn VshgPB7MUU4hZiCIagG5jHEjd+mrnKoBCCvUvmREw1vfHqj6iW0mT67JnCMVcWeC6Atw LWamW8TxrdSFaG42ZCilWJotERY4Enr0GdkFLj/AfkA78cX8scTthrPFw/njLjqGZdkz c7QIwQ921CnqnL0iC3/hPoXJGb4MFHF+ciwUwUKzV0DmpROU+tAg4qcA6xJLhAx3IbIm AHVzRKLeTMqY+vEtW/sufnzKNhMTSI9PeHvhvaq1x+cycyOrfzFvUESO2dRgyQsZ7efu pOVw== X-Gm-Message-State: AJIora+52bX6r0K1UdTwrTgRO8C+/xHh/7A8UFPEr9ynnNA9pPGHqcET XLxLfTWtm1JxJNJH7GLmAc2oisk0IkfDSzPub4pMkA== X-Google-Smtp-Source: AGRyM1vFaEdQgDM29qyn4voj2yrrTxH2EJDlxowJPzz6ftlwqliPUljlPa5IwSQ9XbYQYv1l31k0Gr9jdg+tpmPcgtk= X-Received: by 2002:a05:6402:2804:b0:439:83c2:8be2 with SMTP id h4-20020a056402280400b0043983c28be2mr1861353ede.292.1658529226061; Fri, 22 Jul 2022 15:33:46 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Peter Geoghegan Date: Fri, 22 Jul 2022 15:33:20 -0700 Message-ID: Subject: Re: documentation on HOT To: Bruce Momjian Cc: "Jonathan S. Katz" , "David G. Johnston" , Pg Docs Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Jul 22, 2022 at 2:11 PM Bruce Momjian wrote: > I have improved the wording of the last paragraph in this patch. I think that it would be worth prominently explaining where heap-only tuples get their name from: it comes from the fact there are (by definition) no entries for a heap-only tuple in any index, ever. Indexes are nevertheless capable of locating heap-only tuples during index scans, by dealing with a little additional indirection: they must traverse groups of related tuple versions, all for the same logical row that was HOT updated one or more times -- this group of related tuples is called a HOT chain. This seems like a useful thing to emphasize because it places the emphasis on what *doesn't* happen. Mostly what doesn't happen in indexes. New item identifiers actually *are* needed for heap-only tuples (perhaps we could get away with it, but we don't). However, that doesn't really matter too much in practice. Heap-only tuples can still have their line pointers set to LP_UNUSED directly during pruning, without having to be set to LP_DEAD for a time first (a situation which VACUUM alone can correct by setting the LP_DEAD items to LP_UNUSED during its second heap pass). So heap-only tuples "skip the step" where they have to become LP_DEAD stubs/tombstones. Which is possible precisely because indexes don't need to be considered (they're "heap-only"). I agree that pruning should be discussed here, though -- I wouldn't go as far as treating pruning as 100% unrelated to HOT. Perhaps something along the lines of this works: "It is possible for opportunistic pruning to completely remove all bloat caused by HOT updates (bloat from HOT chains), without leaving any residual garbage that only VACUUM is capable of cleaning up. Pruning a page affected by non-HOT updates or deletes is somewhat less effective, though, because small tombstone items (dead item identifiers) must remain until such time as VACUUM can verify that no remaining index tuples reference the items." Again, the emphasis is on what *doesn't* have to happen because indexes aren't making life hard for us. From the point of view of indexes, ignorance is bliss. The really nice important point about pruning and HOT is that it becomes possible (with care from the DBA and application) to practically eliminate the role of VACUUM. We may not even require a little help from VACUUM, under ideal conditions. -- Peter Geoghegan