Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oEw5q-0003mR-3u for pgsql-docs@arkaria.postgresql.org; Fri, 22 Jul 2022 17:05:38 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1oEw5o-0000OH-7M for pgsql-docs@arkaria.postgresql.org; Fri, 22 Jul 2022 17:05:36 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oEw5n-0000O7-PE for pgsql-docs@lists.postgresql.org; Fri, 22 Jul 2022 17:05:35 +0000 Received: from mail-ej1-x632.google.com ([2a00:1450:4864:20::632]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1oEw5k-0005gl-DQ for pgsql-docs@lists.postgresql.org; Fri, 22 Jul 2022 17:05:34 +0000 Received: by mail-ej1-x632.google.com with SMTP id l23so9627016ejr.5 for ; Fri, 22 Jul 2022 10:05:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bowt-ie.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=sTLeQYkm+KGPQeVHsga97eL/UXWek9JXb61nPCHbQaA=; b=iVzBqeDFU+7qavA5Z9QSvXbzuZzLodM3TZBRNc9BdzGX88tQyzrFUUeA4EBkMdF23o s1T/lYfEYn7sy7Z95eyZSiasqkr4yov+0UfjYNyhzbHiAM6fSo+NkknwKzOSa1IbW+2b kK+fXMF8W0bdD7bpYt9CzNoflTLPLkrVviTdx/cwXtr5YncP6KgWIfvVDnU4bgU3X9SG sHgr0sGu7cTqeAeYsRebd27KoekDakYmdA9yiaJOmlsgc1IsOjaGqlSfoZGcG7ODvHZ8 +z2BsMAoaFxFC4RkvtGyINBmfy4r+D0SRDr5yi0RB1kiwlNshKK4UXVcmG7ymxzncRZ0 TU7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=sTLeQYkm+KGPQeVHsga97eL/UXWek9JXb61nPCHbQaA=; b=YYvykBnOCPdSnY+dSigPc3Uz3y0JorTthMsYXD+4ecbXmyXfZDSZXhLOIHQYb50zB9 crA0cTCkSikGo8e9E+oiW+uOdvlji0ZTAXsG0QbWA3P68j1tRKPBas6TMkmVxKeiys9c 4BRZ2lexePwoIRDDNxfLt6ISiGARxo5BowJyb21SIx0UcBdMKU7BSsXyrhFjc1/baI/6 qGzFdKy/RgGpbEk3MrTLqWuR3PmEb4zT4XEA1NW75NSW6lnz4IQx4QUahxnMpHiwmPRe +9UwpZwwi0JrJs/QHRnPBQbJNzv9hZPyvXCGc030J9lLUOvpXRW77IcSBUfKmsmY2E/5 pNfw== X-Gm-Message-State: AJIora8XfZPEU9EZUrm9RGoCqo2ivVbFwzZteioESCFp9s01U8xeTI6X 3H8k1G6wEEKxO08pYkbq0D/WPvie/LzAVIvi+mqSdFNev0JEJA== X-Google-Smtp-Source: AGRyM1soYRll/EsYTqVweHJQIfvTkGBGEGFSkQG9W8efsvhbXRlQPPf+1dG4Iko4uwMVqQfx1sLeTsHXFxwq2ui45x0= X-Received: by 2002:a17:906:5a68:b0:72b:3e88:fec1 with SMTP id my40-20020a1709065a6800b0072b3e88fec1mr623495ejc.741.1658509530521; Fri, 22 Jul 2022 10:05:30 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Peter Geoghegan Date: Fri, 22 Jul 2022 10:05:04 -0700 Message-ID: Subject: Re: documentation on HOT To: "Jonathan S. Katz" Cc: Bruce Momjian , "David G. Johnston" , Pg Docs Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Jul 22, 2022 at 8:09 AM Jonathan S. Katz wrote: > Thanks! This is great. Probably the most concise and clear explanation I > have seen for HOT, which is exactly what we need for the docs :) I'm delighted that this is finally happening! > I think we should expand on this and explain how adjusting "fillfactor" > will affect this. I think that may change the final sentence too. Definitely -- this is by far the most important reason to tune heap fill factor, which can make a big difference. There should be a link from the fill factor docs to the new HOT section, at a minimum. Another thing: would be good to point out how to monitor the number of HOT updates vs non-HOT updates using views like pg_stat_user_tables. One minor correction: Opportunistic pruning isn't limited to heap-only tuples -- "HOT pruning" is actually a bit of a misnomer, that somehow caught on in the Postgres community. Opportunistic pruning can and will happen even with non-HOT updates -- the only difference on the heap page level is that 4 byte LP_DEAD line pointers will accumulate over time, which only VACUUM is able to clean up (we need LP_DEAD items to stick around until then as tombstones, so that index scans don't ever get confused). But pruning will still be able to free most of the space on the heap page, in almost the same way. Of course the indexes on the table are another matter -- each and every index will accumulate versions when there are many non-HOT updates, which can be a huge problem. My point is that *from the point of view of an affected heap page*, pruning (whether opportunistic or performed by VACUUM) can go ahead without HOT tuples/updates, and isn't really that different in terms of the amount of space freed each time. Bottom-up index deletion is effective is no small part because pruning was always pretty effective even with non-HOT updates, except within indexes -- it pretty much compensates for that remaining problem. The heuristics that it uses in the nbtree code work quite well because of the natural tendency of those 4 byte LP_DEAD item pointers to concentrate on individual heap pages. We can fit so many more 4 byte LP_DEAD items than full sized live heap tuples that it becomes pretty obvious (from the point of view of the nbtree code that starts out by looking at an index leaf page) which pointed-to heap blocks are likely to enable it to free space via index tuple deletes -- it often visits the heap pages with the most TIDs. Before bottom-up index deletion was added, I would notice that pgbench variant workloads with non-HOT updates would result in a pgbench_accounts table that is maybe 1%-2% larger after a few hours, while the indexes on the same table would grow by 100% - 200%. Opportunistic pruning would work pretty well for the heap, but wouldn't do anything at all to control bloat in indexes. So opportunistic pruning was always very effective with non-HOT updates in one important way, but totally ineffective in another way. There is a section about bottom-up index deletion in the docs (in the B-Tree internals chapter) that already references HOT. It should link to this new chapter now, I think. -- Peter Geoghegan