From: Robert Haas
Date: Thu, 20 Nov 2025 09:30:42 -0500
Subject: Re: another autovacuum scheduling thread
To: Nathan Bossart
Cc: Robert Treat, David Rowley, Sami Imseih, Jeremy Schneider, pgsql-hackers@postgresql.org
On Wed, Nov 12, 2025 at 3:10 PM Nathan Bossart wrote:
> I do think re-prioritization is worth considering, but IMHO we should leave
> it out of phase 1.  I think it's pretty easy to reason about one round of
> prioritization being okay.  The order is completely arbitrary today, so how
> could ordering by vacuum-related criteria make things any worse?  In my
> view, changing the list contents in fancier ways (e.g., adding
> just-processed tables back to the list) is a step further that requires
> more discussion and testing.

I agree with your view around reprioritization. To answer your
rhetorical question, the way that reordering the list could hurt is if
the current ordering (pg_class scan order) happened to be a
near-optimal choice. For example, suppose the last table in pg_class
order is in a state where vacuuming appears to be necessary but will be
painful and/or useless (VACUUM will error, xmin will prevent all or
most tuple removal, located on an incredibly slow disk with nothing
cached, whatever). Re-sorting the list figures to move that table
earlier, which will not work out for the best.

I suspect that reprioritization actually increases the danger of this
kind of failure mode. The more aggressive you are about making sure
that the highest-priority tables actually get handled first, the more
important it is to be correct about the real order of priority.

I do think in the long term a really good system is probably going to
accumulate a bunch of extra logic to deal with cases like this. For
example, if the first table in the queue causes VACUUM to spend an
hour chugging away and then fail with an I/O error, we would ideally
want to make sure to wait a while before retrying that table, so that
others don't get starved. But like you say, there's no need to solve
every problem at once.

What seems important to me for this patch is that we don't choose an
actively bad sort order.
For instance, if we don't get the balance between prioritizing
anti-wraparound activity and controlling runaway bloat correct, and
especially if there's no way to recover by tweaking settings, to me
that's a scary scenario. I do think it's fairly realistic for a bad
choice of sort order to end up being a regression over the current
lack of a sort order. You might just be getting lucky right now --
say, because the catalog tables all occur first in the catalog and
vacuuming those tends to be important, and among user tables, the ones
you created first are actually the ones that are most important.
That's not a particularly crazy scenario, IMHO.

Point being: I think we need to avoid the mindset that we can't be
stupider than we are now. I don't think there's any way we would
commit something that is GENERALLY stupider than we are now, but it's
not about averages. It's about whether there are specific cases that
are common enough to worry about which end up getting regressed. I'm
honestly not sure how much of a risk that is, and, again, I'm not
trying to kill the patch. It might well be that the patch is already
good enough that such scenarios will be extremely rare. However, it's
easy to get overconfident when replacing a completely unintelligent
system with a smarter one. The risk of something backfiring can
sometimes be higher than one anticipates.

One idea that might be worth considering is adding a reloption of some
kind that lets the user exert positive control over the sort order. I
know that's scope creep, so maybe it's a bad idea for that reason. But
I think it would be a better idea than Sami's proposal to score system
catalogs more highly, not so much because his idea is necessarily
wrong-headed as because it doesn't help with what I see as the
principal danger here, namely, that whatever we do will sometimes turn
out to be wrong. Trying to be right 100% of the time is not going to
work out as well as having a backup plan for the cases where we are
wrong.
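To make the reloption idea a bit more concrete, here is a rough sketch
of one way such an override could interact with a computed score. This
is purely illustrative and not taken from the patch: the names Table,
score, user_priority, and vacuum_order are all hypothetical, and it
assumes the user-supplied value simply takes precedence over whatever
score the system computes, with 0 meaning "no preference".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Table:
    name: str
    score: float            # hypothetical computed vacuum-urgency score
    user_priority: int = 0  # hypothetical reloption; 0 means "no preference"


def vacuum_order(tables: List[Table]) -> List[Table]:
    """Order tables for processing.

    The user-supplied priority dominates: higher values go first and
    negative values push a table toward the end. The computed score
    only breaks ties between tables with equal user priority.
    """
    return sorted(tables, key=lambda t: (-t.user_priority, -t.score))


tables = [
    Table("a", 5.0),
    Table("b", 1.0, user_priority=10),  # promoted despite a low score
    Table("c", 9.0, user_priority=-1),  # demoted despite a high score
    Table("d", 7.0),
]
print([t.name for t in vacuum_order(tables)])
```

The point of the sketch is only that an explicit override gives the
user a way out when the computed score gets it wrong, in either
direction; whether a strict override or some kind of weighting would
be the better design is an open question.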
-- 
Robert Haas
EDB: http://www.enterprisedb.com