Date: Tue, 26 Jul 2022 19:25:53 -0400
From: Bruce Momjian
To: Peter Geoghegan
Cc: "Jonathan S. Katz", "David G. Johnston", Pg Docs
Subject: Re: documentation on HOT

On Sat, Jul 23, 2022 at 11:33:40AM -0700, Peter Geoghegan wrote:
> On Sat, Jul 23, 2022 at 8:51 AM Bruce Momjian wrote:
> > Good points. I have updated the attached patch and URL to mention that
> > HOT rows are _completely_ removed, and why that is possible, and I
> > clarified the page item identifier mention.
>
> I think that this version looks very good, but I do have some minor notes:
>
> * You wrote "Specifically, updates cause additional rows to be added to tables."
>
> Perhaps this could be rephrased: "Specifically, updates add new
> physical tuples to tables to represent each new version."

Uh, that seems more confusing than what I have. I also considered
"tuples", but if you are saying "old version of a row", you are talking
about an old version of a logical row, not an old version of a physical
tuple, really.

> I think that the term "row" should only refer to the simple/abstract
> idea of a row from a table, while the term tuple should be preferred
> when referring to a physical embodiment of a row, like one version of
> a row. Perhaps it's worth following that convention across the board
> here (not just in this sentence that I have highlighted).

Yes, if we were talking about tuples unrelated to the versions of the
rows they represent, then yes, it would make sense.

> * You wrote "This can also require new index entries for each updated
> row, and removal of old versions of rows can be expensive"
>
> I believe that the operative word in this sentence (which appears in
> the first paragraph) is "can". I think that it would be good to go
> just a bit further with that. Maybe add another sentence immediately
> afterwards that conveys "and now we're going to discuss when and how
> new versions from updates can sometimes avoid the need for a new round
> of index entries".

I ended up adding index cleanup further up in the text --- please see my
patch in the next email I send in this thread.

>
> * You wrote "New index entries are not needed to represent updated rows"
>
> It seems to me that this undersells the key benefit. You could perhaps
> add another sentence. Something like: "This avoids the immediate cost
> of adding new successor versions to each and every index, and avoids
> the cost of removing the obsolete versions from each and every index
> later on."

Same.

>
> * You refer to opportunistic pruning as something that happens "during
> normal operation", but that doesn't seem to get the idea of
> "opportunistic" across.
>
> It seems like it would be worth writing a sentence or two more on
> this, just to get that aspect across. Opportunistic cleanup occurs
> when a query happens to notice that a heap page that it had to read as
> part of query processing needed to be cleaned up in passing. We do it
> there and then because it happens to be relatively cheap and
> convenient to do it that way. That sort of thing.

Yes, we need that, but not in this section --- I would like to it though.

--
  Bruce Momjian        https://momjian.us
  EDB                  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson