Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v1VsI-009Rue-D1 for pgsql-hackers@arkaria.postgresql.org; Wed, 24 Sep 2025 20:14:02 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v1VsG-00FY60-SZ for pgsql-hackers@arkaria.postgresql.org; Wed, 24 Sep 2025 20:14:00 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v1VsG-00FY5r-HO for pgsql-hackers@lists.postgresql.org; Wed, 24 Sep 2025 20:14:00 +0000 Received: from mail-ej1-x62d.google.com ([2a00:1450:4864:20::62d]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v1VsC-002gcT-0G for pgsql-hackers@lists.postgresql.org; Wed, 24 Sep 2025 20:13:59 +0000 Received: by mail-ej1-x62d.google.com with SMTP id a640c23a62f3a-b33d2f0f5f4so39561466b.2 for ; Wed, 24 Sep 2025 13:13:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1758744836; x=1759349636; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=kTu0vGsuekYtA8QFiKzdueJ3KdhFiczAAMCGAoaN8yw=; b=FoboTOuq4YJFqtLuYCP4Aa5YmYoeCW+EMIE+jH5gqyWr9woZFBEVM4Q6HGGhcJvAI2 u069iri3pY+qVh3YzLP75zB6Ci/+cSyBWVDyC8rw3defNy0H3bcEpOshYgcRSeOAvYGz 90KDMZ270pVdKIAHbL5u//06tQa+rlW0ZK6JjUf+4ffKKMiK8tcDg8GgKJUhlmie6yHQ lafUwfDdwmU0CYd0ud4Jbivw3CL+kGlW5AiVgKaayFeMfHqOghzoqnPP3Xoai7z3Amp1 s71SJCSuIqLchYfGCAOcpJNqp4ybUvH51YLh/lzZfwDCclCFSVLzxRVdgCpE3BSPsBWi uChw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758744836; x=1759349636; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=kTu0vGsuekYtA8QFiKzdueJ3KdhFiczAAMCGAoaN8yw=; b=JuPtze3XAqEJaY37FTx00eQdnIHiIl7lqavXR3s2NU69BTeha67CXxwnhSp+ai/ist 9D8lkOWxxxIj2Wxo0lzGdpbpFdjFJhTfy5J0a86weU3Ql8MBzgzRXOHEh6GQKTEGfHVI Arty+J+BzyrCuVBi/O0tOcYpXYtRTOEOY8KkF5qx6gxWTbymoCdKrCAlu1W3EGeev3gu njOUu25XBt0wQdC/nxP2SLyWMcGSMNMQNqnY/U0FqVnaKrGkGw5hFI0qMkf8VfF9NUda aDDHAu6LcscljtCvLAofvx2oqo6PMqu1Z6cKwyycDPFPD1RJ5YCR0a+Cj4/lK9FcWy+l 99rg== X-Forwarded-Encrypted: i=1; AJvYcCXRizObhBZ7XkHm3jhMY1iMl/8L+YphWqqjPcBff2+H6zKWUAtNyZlS+OgSjAqafTaqm+w9bQAUGlWJ/r6x@lists.postgresql.org X-Gm-Message-State: AOJu0YwMGBkvwfpT+OhpIf0JOkRUmCozsR1QJJXCr9SfBaO+EMkkO1bW CXenpo0zGRPv6mdJ6JeF4zJMpoWspFQ9oHIJVfnzLWVKnS0BPaIH8YuvrfLS4A9+4rKnRi6vBfF usj8zbn3ZiPHGHgtNkoMPnoZ0NFPEBOU= X-Gm-Gg: ASbGncscyiNh2xFpYSbGc7FZgXMI1tE8wRhxeZbEvElIIybDAet87UyjSfrtTGhdb1R qvk4dNaI6i+hFNtnGVzYOdhO8Jb2kuwCQlHY3kSXTQPdBNt93eXP/M5GC46PNOGOLq7272siG46 EUGda6jt9HvC9yYAQP6Owm2916jyvDsyUd8dtRb++V3hBZ/u5F4nlDw2ooRAjskcoZGerYozA// qkwwyU= X-Google-Smtp-Source: AGHT+IFdJe4XlfmBwzolM/SyFTcq6rUKvEzrZXkNTh2LZ2y/TPzZmrYJT4L0QQtbDFLpfPdTyH+uu7Y8ux0FAtewvqc= X-Received: by 2002:a17:907:9805:b0:b32:2b60:eed with SMTP id a640c23a62f3a-b34be7cedbcmr92011966b.44.1758744835980; Wed, 24 Sep 2025 13:13:55 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Robert Haas Date: Wed, 24 Sep 2025 16:13:43 -0400 X-Gm-Features: AS18NWBv2_pd7aP6Z_Eu48ODxq6I8glPLjMnNHOrl7c4mhNmT79wCz2CeBstyyI Message-ID: Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) To: Melanie Plageman Cc: Andres Freund , Kirill Reshke , Andrey Borodin , PostgreSQL Hackers , Heikki Linnakangas Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk I find this patch set quite hard to follow. 0001 altogether removes the use of XLOG_HEAP2_VISIBLE in cases where we use XLOG_HEAP2_MULTI_INSERT, but then 0007 (the next non-refactoring patch) begins half-removing the dependency on XLOG_HEAP2_VISIBLE, assisted by 0009 and 0010, and then later you come back and remove the other half of the dependency. I know it was I who proposed (off-list) first making the XLOG_HEAP2_VISIBLE record only deal with the VM page and not the heap buffer, but I'm not sure that idea quite worked out in terms of making this easier to follow. At the least, it seems weird that COPY FREEZE is an exception that gets handled in a different way than all the other cases, fully removing the dependency in one step. It would also be nice if each time you repost this, or maybe in a README that you post along beside the actual patches, you'd include some kind of roadmap to help the reader understand the internal structure of the patch set, like 1 does this, 2-9 get us to here, 10-whatever get us to this next place. I don't really understand how the interlocking works. 0011 changes visibilitymap_set so that it doesn't take the heap block as an argument, but we'd better hold a lock on the heap page while setting the VM bit, otherwise I think somebody could come along and modify the heap page after we decided it was all-visible and before we actually set the VM bit, which would be terrible. I would expect the comments and the commit message to say something about that, but I don't see that they do. I find myself fearful of the way that 0007 propagates the existing hacks around setting the VM bit into a new place: + /* + * We always emit a WAL record when setting PD_ALL_VISIBLE, but we are + * careful not to emit a full page image unless + * checksums/wal_log_hints are enabled. We only set the heap page LSN + * if full page images were an option when emitting WAL. Otherwise, + * subsequent modifications of the page may incorrectly skip emitting + * a full page image. + */ + if (do_prune || nplans > 0 || + (xlrec.flags & XLHP_SET_PD_ALL_VIS && XLogHintBitIsNeeded())) + PageSetLSN(page, lsn); I suppose it's not the worst thing to duplicate this logic, because you're later going to remove the original copy. But, it took me >10 minutes to find the text in src/backend/access/transam/README, in the second half of the "Writing Hints" section, that explains the overall principle here, and since the patch set doesn't seem to touch that text, maybe you weren't even aware it was there. And, it's a little weird to have a single WAL record that is either a hint or not a hint depending on a complex set of conditions. (IMHO mixing & and && without parentheses is quite brave, and an explicit != 0 might not be a bad idea either.) Anyway, I kind of wonder if it's time to back out the hack that I installed here many years ago. At the time, I thought that it would be bad if a VACUUM swept over the visibility map setting VM bits and as a result emitted an FPI for every page in the entire heap ... but everyone who is running with checksums has accepted that cost already, and with those being the default, that's probably going to be most people. It would be even more compelling if we were going to freeze, prune, and set all-visible on access, because then presumably the case where we touch a page and ONLY set the VM bit would be rare, so the cost of doing that wouldn't matter much, but I guess the patch doesn't go that far -- we can freeze or set all-visible on access but not prune, without which the scenario I was worrying about at the time is still fairly plausible, I think, if checksums are turned off. -- Robert Haas EDB: http://www.enterprisedb.com