Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vWGCk-00GAls-1n for pgsql-hackers@arkaria.postgresql.org; Thu, 18 Dec 2025 15:46:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vWGCj-002qko-11 for pgsql-hackers@arkaria.postgresql.org; Thu, 18 Dec 2025 15:46:14 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vWGCj-002qkf-03 for pgsql-hackers@lists.postgresql.org; Thu, 18 Dec 2025 15:46:13 +0000 Received: from mail-qt1-x835.google.com ([2607:f8b0:4864:20::835]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vWGCi-001LwO-0C for pgsql-hackers@lists.postgresql.org; Thu, 18 Dec 2025 15:46:12 +0000 Received: by mail-qt1-x835.google.com with SMTP id d75a77b69052e-4ee257e56aaso7308451cf.0 for ; Thu, 18 Dec 2025 07:46:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766072771; x=1766677571; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=PYaz59GKplrJr7z5H8NAIFctgd+FALs8UNGClCNogs8=; b=G9XBQe7unQq/e24OQ4E+/Yr3uGKbC+cCu6Qk/808/1koQ9HU6afDda47/eTPsvQESo BQ4MSPgE0F20KFSeoLf7x0PhJk+ZfmFQ7qnPufSjP7ZhWMD5DZIdO9OnogHm/FM9cd3Q RwrH8jb8fqUd+y8DIvYcB3+5aG+5TA1VH9egI1jBVkXB1nsq0nUQpi1/ea3AfYqC6Z6+ H6cZIJebY5Ec/LRQsq6DQtdmSgTL7EAsUUW3RZAViV0hP7vcBfTfvX/UfHoAnvCooYHD b+kwZSSjexvegsrWaaa3SiLwATnYE7bD4vZToczv5ZyjJEyjJ0OQC8ex6dtPeJ6ojTcF ozNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766072771; x=1766677571; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=PYaz59GKplrJr7z5H8NAIFctgd+FALs8UNGClCNogs8=; b=REH275udhdsdm6sultCvDlZYydsHKuZMaC1Bm5IbeueKm5dQ891fjM4QoJRZQe/oVn QJj6D3UBIelKq6IfEp/d4A7qOZfjiUfRkcyRdnmhw3lDwvnDnx12gKbUhbXbjz6/Nnwp qSsf7x+YKcAJiEdKfKlzrAzZN2pEps3fNifbe6WHGQ6o2AUW9nVaPDCThjPO/4JTzRK9 NqhCWfxQrl+wJVK6qBs5MvGR1p1H6ckj4gTKZF6v1U1hnDXXAHkyMRid5JtN35z2xLZk ZfqXBh9sjTNxcuoxJjLAash/wV5s5eJMCMovzkQIchxC5vXzvuyn5Pt9j0RH9XmXEIYu 9lkA== X-Forwarded-Encrypted: i=1; AJvYcCX96FmnQ2kaKxUFqrmtqrEiWmvrKnW68GxYPal/kBmptJV9AIpvt5imKwEc5hMrh6KpMVj+/A+ora6QdEPw@lists.postgresql.org X-Gm-Message-State: AOJu0YxNPCfxoRJp6O9NTKuD8yQDgr7nHU+0MjuUVSHvMwlhCSjBLMJU xWrE3ux3Zaqd5fW5q3A0JH2BiS/8GT/tc6pu5KMlZtGFxg7xla6rwk+1Bx2FwVJOe3BhfJSYpaH 0a3BbSsEN1ai2iJxc4OCSJxaF4wQyIFQ= X-Gm-Gg: AY/fxX4n800xEi22M+HU6itL3aPge7pxEATmQtvvwpxak8P50MAL7s0erD2TpQTsYBm D/6+zptBd1rf2ut/OXDbS8Gac8FCyMJJCCjZ7YuEFUxWFwI60hH9zSVc09fiXEbEU2Ucz7FOQBu nkwfkMeah+WqWHpyrCLdy2p+MGvhS1UsrCCLlDMAkQ8V3qHklD7E1YH+4oSQc8P2EEiUsB2Wuhn /NSWJusuXfHo/EEJguEj57Hu91lRCcuLlXPifXEhZ+7p3zIHEgoh9v4JYdwoU48VhwvlORFqUks JFHFBd+S X-Google-Smtp-Source: AGHT+IGMEuAVlDMi+w7C5cslTANUg+5H4BtaeymMP6iBQAdbm7zLhsfYjDWPTj4xwvUnBmCwgzMWBbz2aVeuAxIqJko= X-Received: by 2002:ac8:5a81:0:b0:4ee:26b3:e512 with SMTP id d75a77b69052e-4f35f3b7834mr45707471cf.13.1766072770710; Thu, 18 Dec 2025 07:46:10 -0800 (PST) MIME-Version: 1.0 References: <2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl@mdyyqpqzxjek> In-Reply-To: From: Kirill Reshke Date: Thu, 18 Dec 2025 20:45:58 +0500 X-Gm-Features: AQt7F2q1sjMIImRlhvC99OkCA1zKnyFHLydJlXcPF5UPqqtY2Vcy5sXr5QzVjr0 Message-ID: Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) To: Melanie Plageman Cc: Andres Freund , Robert Haas , Andrey Borodin , PostgreSQL Hackers , Heikki Linnakangas , Chao Li Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, 18 Dec 2025 at 20:18, Melanie Plageman wrote: > > Also, after the whole set is committed, we should then never > > experience discrepancy between PD_ALL_VISIBLE and VM bits? Because > > they will be set in a single WAL record. The only cases when heap and > > VM disagrees on all-visibility then are corruption, > > pg_visibilitymap_truncate and old data (data before v19+ upgrade?) > > If my understanding is correct, should we add document this? > > Even on current master, I don't see a scenario other than VM > corruption or truncation where PD_ALL_VISIBLE can be set but not the > VM (or vice versa). The only way would be if you error out after > setting PD_ALL_VISIBLE before setting the VM. Setting PD_ALL_VISIBLE > is not in a critical section in lazy_scan_prune(), so it won't panic > and dump shared memory, so the buffer with PD_ALL_VISIBLE set may > later get written out. But the only obvious way I see to error out of > MarkBufferDirty() is if the buffer is not valid -- which would have > kept us from doing previous operations on the buffer, I would think. > Well... I may be missing something, but on current HEAD, XLOG_HEAP2_PRUNE_VACUUM_SCAN and XLOG_HEAP2_VISIBLE are two different record, XLOG_HEAP2_PRUNE_VACUUM_SCAN being always emitted first. So, WAL writer may end up kill-9-ed just after XLOG_HEAP2_PRUNE_VACUUM_SCAN makes it to the disk, and XLOG_HEAP2_VISIBLE never. Crash recovery then, and we have discrepancy. This does not happen with a single WAL record. Another simple reproducer here: standby streaming, receiving XLOG_HEAP2_PRUNE_VACUUM_SCAN from primary, Then network becomes bad, and we never get XLOG_HEAP2_VISIBLE from primary. Then we promoted by the admin. And again, VM bit vs PD_ALL_VISIBLE discrepancy. Am I missing something? -- Best regards, Kirill Reshke