Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vdOhO-0003OH-1T for pgsql-hackers@arkaria.postgresql.org; Wed, 07 Jan 2026 08:15:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vdOhM-00CdFA-2W for pgsql-hackers@arkaria.postgresql.org; Wed, 07 Jan 2026 08:15:21 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vdOhM-00CdF2-1O for pgsql-hackers@lists.postgresql.org; Wed, 07 Jan 2026 08:15:21 +0000 Received: from mail-dl1-x1235.google.com ([2607:f8b0:4864:20::1235]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vdOhK-0052u7-2F for pgsql-hackers@lists.postgresql.org; Wed, 07 Jan 2026 08:15:20 +0000 Received: by mail-dl1-x1235.google.com with SMTP id a92af1059eb24-121b14efeb8so2225619c88.1 for ; Wed, 07 Jan 2026 00:15:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767773716; x=1768378516; darn=lists.postgresql.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=FmbyhxHUGG6pQGP7XBatcyxtMSTBwJ0hu8aQ3Gb1UJU=; b=OhslugZEhUw2uVlZK0YRz8XO5VrNxxnWhNDALLTS27XbS74Z2kvq8TLmXWqVoTAPZj GLR0WhKHfAE9d9aTJh5VW3NdO3gtrN3i2T/Us8WCOEZ2NtXmQfKe1q3M6FfD5re8R+Ns dFphAsJtR7EoEb6Sgarz8jTz2OG55ty0D1ZYiPPBfUaGIY600fO4ANipqvo9tA+S3PsW fO2e6j81ldunjKmfNlqsCa0J1Uuat10iMmkWsFFK0EwEuqRK1A1TBcst5Sms8ytpu5oN ZY7e+Jdp7jBi5mC5xkRnFMFkJAuhi7MqN3wEspaj8IpaDg6Hv1lQ26lnDPJ0jDsdBNTh RNDA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767773716; x=1768378516; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=FmbyhxHUGG6pQGP7XBatcyxtMSTBwJ0hu8aQ3Gb1UJU=; b=B+4FyyBpgcISySDoV9gTunHGFkcYw6AGGUXc9hwSHqWDLt6zKgpUatgysS4CPBtkQC fbdi5cAcrZ8jYjEBPAI6c+mAnfTAIg96UFUfwjd88tAOb01MXjZxFr7HOm3ZGGPp5Tfm 6nqaDumwgatTQ1+FNRCszyDiwU66/eMKNR9s/Z9csYmcFMHyKf0YLq8iKgzDNdvF5xha NQjU31D/AMVnSRVdViJvSoaxJMUVq2AQrC0BO2hRXl1cqBpq8ZlZNSdUnSxslXnOg7YF ojSnDwyF3DW1x2APi1hm+wwRkVKAc5F2Bv+LPefaCy8IQcb9YVRrD+Mo0Edk/sNXlCkr qBWg== X-Forwarded-Encrypted: i=1; AJvYcCVcQmELnVOYnZP1U0VSKNfN8kC7Dok1kjyBPKJ1WUwpLGluvYEuxXt41HWL0bW+9KbRcLxIZJdjKJ6jY5nv@lists.postgresql.org X-Gm-Message-State: AOJu0YwWfCpuEU3x0onf5OPKn8AhNrjRjZJ9iyGdnWHZ2KOqRmfIgIXL YvZIdhi0C9QAgeS+V5Y22DRIJPqwIdHlw7gBEcuU8Ewuxpz4jBe1pBUf X-Gm-Gg: AY/fxX6kOhIf3k/vZfzRNSF2k28ZvC5RxADW0udAmidOMyHxN8AZEGU2UqOFHEllwCV KEfWogeXJur8NyqCGdFM/x2Q0OsjN+2GVOvIkU3PdiShv+jHTCpknwbNGsQVMY7R+BrWNphmLlZ NeXSy/QIF47zNyhdrQ3XYNRacU10a06UsiIs+TD8QrFK9z+wxfvBwisSPtO1N5RSRoXBiX5Hkf6 ZRUa9JXsBhQqYJI+7VeAQFyGLS79NKVvt+GXfmAHwa55BJ4OQMy52oUVtK9UAvs3/hmQBZV24R0 JeSwGIk67ZrfHzcZJwCQMDIR0cwzdEtfsBQCCABQjrTXOTL7vuaX8syVsvOEVNLGuoxKxTHdyYy 0z+bw6b9DdVx7SfDkh2fw9NZUB8921q+dV1xeDMCuAIqYg4DTgbKEo/Txmxceh7ioPQHsJybYJe ClGzx2IAv8qRlxZDwq7mJp X-Google-Smtp-Source: AGHT+IGh+JsJlQFdHUL5mLhTcxB+bqXtrsNueQgsqthumfSAPEvuht9dRtMry45GpWh83Un+txbnsQ== X-Received: by 2002:a05:7022:a84:b0:11b:9386:825c with SMTP id a92af1059eb24-121f8b68c16mr1697978c88.41.1767773715921; Wed, 07 Jan 2026 00:15:15 -0800 (PST) Received: from smtpclient.apple ([142.171.105.12]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-121f249894fsm7256333c88.15.2026.01.07.00.15.11 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Jan 2026 00:15:15 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81\)) Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) From: Chao Li In-Reply-To: Date: Wed, 7 Jan 2026 16:14:38 +0800 Cc: Andrey Borodin , Kirill Reshke , Xuneng Zhou , Andres Freund , Robert Haas , PostgreSQL Hackers , Heikki Linnakangas Content-Transfer-Encoding: quoted-printable Message-Id: References: <2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl@mdyyqpqzxjek> <6BC5DBAB-6084-4BB8-8450-52E9648AB021@gmail.com> <7F5BCD7A-764D-4D8D-8E27-6F2CAAEA1CEE@gmail.com> <4379FDA3-9446-4E2C-9C15-32EFE8D4F31B@yandex-team.ru> To: Melanie Plageman X-Mailer: Apple Mail (2.3826.700.81) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk > On Jan 7, 2026, at 01:31, Melanie Plageman = wrote: >=20 > On Tue, Jan 6, 2026 at 4:40=E2=80=AFAM Andrey Borodin = wrote: >>=20 >>> >>=20 >> I've tried to take an attempt to review some patches of this = patchset. It's huge and mostly polished. >=20 > I've added attributed your review on the patches you specifically > mention here (and from previous emails you sent). Let me know if there > are other patches you reviewed that you did not mention. >=20 >> In a step "Pass down information on table modification to scan node" = you pass SO_HINT_REL_READ_ONLY flag in IndexNext() and = BitmapTableScanSetup(), but not in IndexNextWithReorder() and = IndexOnlyNext(). Is there a reason why index scans with ordering cannot = use on-access VM setting? >=20 > Great point, I simply hadn't tested those cases and didn't think to > add them. I've added them in attached v33. >=20 > While looking at other callers of index_beginscan(), I was wondering > if systable_beginscan() and systable_beginscan_ordered() should ever > pass SO_HINT_REL_READ_ONLY. I guess we would need to pass if the > operation is read-only above the index_beginscan() -- I'm not sure if > we always know in the caller of systable_beginscan() whether this > operation will modify the catalog. That seems like it could be a > separate project, though, so maybe it is better to say this feature is > just for regular tables. >=20 > As for the other cases: We don't have the relation range table index > in check_exclusion_or_unique_constraints(), so I don't think we can do > it there. >=20 > And I think that the other index scan cases like in replication code > or get_actual_variable_endpoint() are too small to be worth it, don't > have the needed info, or don't do on-access pruning (bc of the > snapshot type they use). >=20 >> Also, comment about visibilitymap_set() says "Callers that log VM = changes separately should use visibilitymap_set()" as if = visibilitymap_set() is some other function. >=20 > Ah, yes, I forgot to remove that when I removed the old > visibilitymap_set() and made visibilitymap_set_vmbits() into > visiblitymap_set(). Done in v33. >=20 > - Melanie > = <= v33-0013-Track-which-relations-are-modified-by-a-query.patch> I see the same problem in 0009 and 0010: 0009 ``` --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -3570,6 +3570,7 @@ heap_page_would_be_all_visible(Relation rel, = Buffer buf, { ItemId itemid; HeapTupleData tuple; + TransactionId dead_after; =20 /* * Set the offset number so that we can display it along = with any @@ -3609,12 +3610,14 @@ heap_page_would_be_all_visible(Relation rel, = Buffer buf, =20 /* Visibility checks may do IO or allocate memory */ Assert(CritSectionCount =3D=3D 0); - switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, = buf)) + switch (HeapTupleSatisfiesVacuumHorizon(&tuple, buf, = &dead_after)) { case HEAPTUPLE_LIVE: { TransactionId xmin; =20 + = Assert(!TransactionIdIsValid(dead_after)); + /* Check comments in = lazy_scan_prune. */ if = (!HeapTupleHeaderXminCommitted(tuple.t_data)) { @@ -3647,8 +3650,10 @@ heap_page_would_be_all_visible(Relation rel, = Buffer buf, } break; =20 - case HEAPTUPLE_DEAD: case HEAPTUPLE_RECENTLY_DEAD: + = Assert(TransactionIdIsValid(dead_after)); + /* FALLTHROUGH */ ``` 0010: ``` static bool -heapam_scan_analyze_next_tuple(TableScanDesc scan, TransactionId = OldestXmin, +heapam_scan_analyze_next_tuple(TableScanDesc scan, double = *liverows, double *deadrows, = TupleTableSlot *slot) { @@ -1047,6 +1047,7 @@ heapam_scan_analyze_next_tuple(TableScanDesc scan, = TransactionId OldestXmin, ItemId itemid; HeapTuple targtuple =3D &hslot->base.tupdata; bool sample_it =3D false; + TransactionId dead_after; =20 itemid =3D PageGetItemId(targpage, hscan->rs_cindex); =20 @@ -1069,16 +1070,20 @@ heapam_scan_analyze_next_tuple(TableScanDesc = scan, TransactionId OldestXmin, targtuple->t_data =3D (HeapTupleHeader) = PageGetItem(targpage, itemid); targtuple->t_len =3D ItemIdGetLength(itemid); =20 - switch (HeapTupleSatisfiesVacuum(targtuple, OldestXmin, - = hscan->rs_cbuf)) + switch (HeapTupleSatisfiesVacuumHorizon(targtuple, + = hscan->rs_cbuf, + = &dead_after)) { case HEAPTUPLE_LIVE: sample_it =3D true; *liverows +=3D 1; break; =20 - case HEAPTUPLE_DEAD: case HEAPTUPLE_RECENTLY_DEAD: + = Assert(TransactionIdIsValid(dead_after)); + /* FALLTHROUGH */ ``` I believe the reason why we add Assert(TransactionIdIsValid(dead_after)) = under HEAPTUPLE_RECENTLY_DEAD is to ensure that when = HeapTupleSatisfiesVacuumHorizon() returns HEAPTUPLE_RECENTLY_DEAD, = dead_after must be set. So the goal of the assert is to catch bugs of = HeapTupleSatisfiesVacuumHorizon(). =46rom this perspective, I now feel dead_after should be initialized to = InvalidTransactionId. Otherwise, say HeapTupleSatisfiesVacuumHorizon() = has a bug and miss to set dead_after, then the assert mostly like = won=E2=80=99t be fired, because it holds a random value, most likely not = be 0. I know this comment conflicts to one of my previous comments, sorry = about that. As I read this patch once and again, I am getting more = understanding to it.=20 0014 ``` + /* set if the query doesn't modify the rel */ + SO_HINT_REL_READ_ONLY =3D 1 << 10, ``` Nit: I think it=E2=80=99s better to replace =E2=80=9Crel=E2=80=9D to = =E2=80=9Crelation=E2=80=9D. For a function comment, if there is a = parameter named =E2=80=9Crel=E2=80=9D, then we can use it to refer to = the parameter, without such a context, I guess here a while word is = better. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/