Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wEtqq-004XPP-0G for pgsql-hackers@arkaria.postgresql.org; Mon, 20 Apr 2026 19:00:08 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wEtqo-004GnW-2k for pgsql-hackers@arkaria.postgresql.org; Mon, 20 Apr 2026 19:00:06 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wEtqo-004GnN-1N for pgsql-hackers@lists.postgresql.org; Mon, 20 Apr 2026 19:00:06 +0000 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wEtql-00000001xv3-46gh for pgsql-hackers@lists.postgresql.org; Mon, 20 Apr 2026 19:00:05 +0000 Received: by mail-wr1-x429.google.com with SMTP id ffacd0b85a97d-43d43e09de5so1891644f8f.1 for ; Mon, 20 Apr 2026 12:00:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1776711603; x=1777316403; darn=lists.postgresql.org; h=in-reply-to:from:content-language:references:cc:to:subject :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=JGYxgsQZpD9tNB23Ngxw4PzLwSnX2A6rSYWFgS0ncE0=; b=aNOgJVVUoV6d3aosTGQfzO4soiV3a+3aRw+ChgjvncNpXw3ayaZoAKGU1WcX7lvAVU MKLmW+cDyPhVy103mpDvqRUEm+F3w+Amhh2VguczKtBZIi7u3NXDt7uG47isbK8d2Llj LIn+TA3n/JH6rDQqgDUrr32j4fr2jU6Y8OiEY09ivWZjaop9ndq+8ZQetam/NVHgJ2zy hgSsC6KqYlSafbkyA8/BtOt8iBIZVjY4AWx2uiY7NGM9+rT/23DY+TsBgWyyBKbiff8I FtaEe5HfP/CACLAI/44YSpYdE1dZm/Xbwh9BpAqeAnfiluq2uKYycebAswmRUExO3T9P SKFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776711603; x=1777316403; h=in-reply-to:from:content-language:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-gg:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=JGYxgsQZpD9tNB23Ngxw4PzLwSnX2A6rSYWFgS0ncE0=; b=FJDBJt22PNXh9qX1Nb8tTuWnqR5p8IiHuOxYNEDSnYbnj9+pJK0XH3kTJTSvjidqJ5 LHIwi+rRY7Y9szTi80W8O1hmcUbohSNVazHVEXVGBoMzwP3iqvQPFwD6LX3q+2iN1NAN qhJw0FsfpXxTRs+GEyawjYrz1IJVu3YBqVDzeXnV76blYyE19PpZJYapkcP440TWU2yy 2nMIss4FxcO8ggM/dJdFGKrJ0jB7K3S1G0WrUe14aNTb9NPD2gA0rgoCyEEMYIJ/NMOD a/vojdOPLVzrj6p9ACkAp5sI6sm4WejzbnIRt9gmk5VeeNWTNshexqWFSpbHUEaKwsNB k3RA== X-Forwarded-Encrypted: i=1; AFNElJ9mJ14tAFqq3S6k7WCmDTHw+ITihkttlQbqUetZgf5l1N1r8LNtjXNjstWWRqGFZIuMYMfWXOVexBY4LSQg@lists.postgresql.org X-Gm-Message-State: AOJu0YzU9D5yWFsjUmcKkalNwSMX4aV17f78J1IcR3/u1z6D0asGTJ4Z blQ6nGKle2QeErPUq4vwvQ3s+KyvBiSOmA6xdS4YTSi8U77wUPQKdZwo X-Gm-Gg: AeBDievFQtN+Px+T+aiOkdS9tojMgTSf5D8+zBvlgJh7fQyFZjGNPElNejFgsqvPC6B aMuD8NeVq4M2paeGpLIMi4rmVvCTTcAIkn2JKuhErAJ4XrgXu/a55nmV3FU8Z7EELqwJLo/UudD IOShJyVpOpnL1c6x7SdeJArETkbFEwZQn1mpH695OAcR6HflqtFNvpVNQE4SfE5gRI7t3/j0fZJ 6Tfc1OYYnKhyFr00EBrgYZ9blv3lUpATqwpDc7aQzuBzutMaNPzdyNJT0uLMZq3brliRAdns5Rl 8/i2tHzQlwvnl1ASZkPT5K/gJFwzMU4kbQN6i9CaFMfjAOdcIZLwKGz29U1uv3W0ntKJqXkeeqq h/XbZjWlrfFvG2uIeEKA+7huYT7p5IPwuvs+OzgKIZFaY9SYQf4xtiyJnUZWfMEIdlMQnnNMADH PyPpAVr8K2juz7mh21+5GeXeJzV1GIzLtkKUw= X-Received: by 2002:a05:6000:2502:b0:43e:b0f8:c564 with SMTP id ffacd0b85a97d-43fe3dbd756mr22223468f8f.9.1776711602506; Mon, 20 Apr 2026 12:00:02 -0700 (PDT) Received: from [192.168.0.50] ([89.149.68.143]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-43fe4e3a18csm35479868f8f.20.2026.04.20.12.00.01 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 20 Apr 2026 12:00:01 -0700 (PDT) Content-Type: multipart/alternative; boundary="------------v15PYiPt4UQ0hVpa0FSCKCje" Message-ID: <71277259-264e-4983-a201-938b404049d7@gmail.com> Date: Mon, 20 Apr 2026 22:00:00 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) To: Melanie Plageman , Andres Freund Cc: Tomas Vondra , David Rowley , Kirill Reshke , Chao Li , Andrey Borodin , Xuneng Zhou , Robert Haas , PostgreSQL Hackers , Heikki Linnakangas References: <2be31f17-5405-4de9-8d73-90ebc322f7d8@vondra.me> <97529f5a-ec10-46b1-ab50-4653126c6889@gmail.com> <46733d68-aec0-4d09-8120-4c66b87047a4@gmail.com> Content-Language: en-US From: Alexander Lakhin In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------v15PYiPt4UQ0hVpa0FSCKCje Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Hello Melanie and Andres, 20.04.2026 19:18, Melanie Plageman wrote: >> but I wonder if there are other queries, >> which plans can change due to the same reason. > I think we'll have to take this on a case-by-case basis when we see > failures. While it is certainly possible other tests just rely on > autovacuum not having run and set the page all-visible, many of them > probably have already had to account for that. Thank you for paying attention to this! I think, I found another test which suffers from autoanalyze with the new behavior: [1], [2]. Initially I reproduced this diff on a slow armv7 device after many iterations of `make check` with: autovacuum_naptime = 1 autovacuum_analyze_threshold = 1 debug_parallel_query = 'regress' But now I see that it can be reproduced on an ordinary machine with just: --- a/src/test/regress/sql/plancache.sql +++ b/src/test/regress/sql/plancache.sql @@ -208,2 +208,3 @@ execute test_mode_pp(1); -- 2x  execute test_mode_pp(1); -- 3x +analyze test_mode;  execute test_mode_pp(1); -- 4x (and expected/plancache.out updated) and `make check` running in a loop. It failed for me on iterations 5, 4, 10 (as far as I can see, analyze updates relallvisible not every time): # parallel group (18 tests):  prepare xml conversion plancache limit returning copy2 polymorphism sequence rowtypes largeobject temp rangefuncs with truncate domain plpgsql alter_table # diff -U3 .../src/test/regress/expected/plancache.out .../src/test/regress/results/plancache.out # --- .../src/test/regress/expected/plancache.out    2026-04-20 21:35:30.677775398 +0300 # +++ .../src/test/regress/results/plancache.out     2026-04-20 21:43:49.324492302 +0300 # @@ -374,11 +374,11 @@ # #  -- we should now get a really bad plan #  explain (costs off) execute test_mode_pp(2); # -         QUERY PLAN # ------------------------------ # +                        QUERY PLAN # +---------------------------------------------------------- #   Aggregate # -   ->  Seq Scan on test_mode # -         Filter: (a = $1) # +   ->  Index Only Scan using test_mode_a_idx on test_mode # +         Index Cond: (a = $1) #  (3 rows) # #  -- but we can force a custom plan The same modified test survived 50 iterations at 378a21618~1. [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2026-04-07%2012%3A45%3A07 [2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2026-04-12%2022%3A45%3A06 Best regards, Alexander --------------v15PYiPt4UQ0hVpa0FSCKCje Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit
Hello Melanie and Andres,

20.04.2026 19:18, Melanie Plageman wrote:
but I wonder if there are other queries,
which plans can change due to the same reason.
I think we'll have to take this on a case-by-case basis when we see
failures. While it is certainly possible other tests just rely on
autovacuum not having run and set the page all-visible, many of them
probably have already had to account for that.

Thank you for paying attention to this!

I think, I found another test which suffers from autoanalyze with the new
behavior: [1], [2].

Initially I reproduced this diff on a slow armv7 device after many
iterations of `make check` with:
autovacuum_naptime = 1
autovacuum_analyze_threshold = 1
debug_parallel_query = 'regress'

But now I see that it can be reproduced on an ordinary machine with just:
--- a/src/test/regress/sql/plancache.sql
+++ b/src/test/regress/sql/plancache.sql
@@ -208,2 +208,3 @@ execute test_mode_pp(1); -- 2x
 execute test_mode_pp(1); -- 3x
+analyze test_mode;
 execute test_mode_pp(1); -- 4x
(and expected/plancache.out updated)

and `make check` running in a loop. It failed for me on iterations 5, 4,
10 (as far as I can see, analyze updates relallvisible not every time):
# parallel group (18 tests):  prepare xml conversion plancache limit returning copy2 polymorphism sequence rowtypes largeobject temp rangefuncs with truncate domain plpgsql alter_table
# diff -U3 .../src/test/regress/expected/plancache.out .../src/test/regress/results/plancache.out
# --- .../src/test/regress/expected/plancache.out    2026-04-20 21:35:30.677775398 +0300
# +++ .../src/test/regress/results/plancache.out     2026-04-20 21:43:49.324492302 +0300
# @@ -374,11 +374,11 @@
#  
#  -- we should now get a really bad plan
#  explain (costs off) execute test_mode_pp(2);
# -         QUERY PLAN          
# ------------------------------
# +                        QUERY PLAN                        
# +----------------------------------------------------------
#   Aggregate
# -   ->  Seq Scan on test_mode
# -         Filter: (a = $1)
# +   ->  Index Only Scan using test_mode_a_idx on test_mode
# +         Index Cond: (a = $1)
#  (3 rows)
#  
#  -- but we can force a custom plan

The same modified test survived 50 iterations at 378a21618~1.

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2026-04-07%2012%3A45%3A07
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2026-04-12%2022%3A45%3A06

Best regards,
Alexander
--------------v15PYiPt4UQ0hVpa0FSCKCje--