From: Robert Haas
Date: Sat, 22 Nov 2025 06:28:13 -0500
Subject: Re: another autovacuum scheduling thread
To: David Rowley
Cc: Sami Imseih, Nathan Bossart, Robert Treat, Jeremy Schneider, pgsql-hackers@postgresql.org

On Thu, Nov 20, 2025 at 5:12 PM David Rowley wrote:
> It wasn't intended to be offensive.

OK.

> I suspect the most likely area the new prioritisation order could
> cause issues is from the lack of randomness. Will multiple workers
> working into the same database be more likely to bump into each other
> somehow in a bad way? Maybe that's a good area to focus testing.

I agree that lack of randomness could cause problems, but I don't see how it could cause regressions, because the current system isn't random, either. Even if the order of pg_class is unpredictable, it may (depending on the workload) not change very much from one day to the next.

> Yeah partly, but mostly I just really doubt that this matters that
> much. It's been said on this thread already that prioritisation isn't
> as important as the autovacuum-configured-to-run-too-slowly issue, and
> I agree with that. I just find it hard to believe that the highly
> volatile pg_class order has been just perfect all these years and that
> sorting by percentage-over-threshold-desc will make things worse
> overall. There was mention that pg_catalog tables are first in
> pg_class, but I don't really agree with that as if I create some new
> tables on a fresh database, I see those getting lower ctids than any
> pg_catalog table. The space for that is finite, but there's no
> shortage of other reasons for user tables to become mentioned in
> pg_class before catalogue tables as the database gets used. I see that
> table_beginscan_catalog() uses SO_ALLOW_SYNC too, so there's an extra
> layer of randomness from sync scans. I don't recall any complaints
> from the order autovacuum works on tables, so, to me, it just seems
> strange to think that the volatile order of pg_class just happened to
> be right all these years. I suspect what's happening is that the extra
> bloat or stale statistics that people get as a result of the
> pg_class-order autovacuum just gets unnoticed, ignored or attended to
> via adjustments to the corresponding scale_factor reloption.

Interesting. I don't have any real knowledge of how jumbled-up the order of pg_class is on real production systems, and I agree that if the answer is "it's usually quite jumbled up" then that is good news for this patch. In any case, I'm not trying to say that prioritization is an intrinsically bad idea, because I don't believe that. What I'm trying to say is that there's a limited number of ways for this patch to make things worse, and one of them is if someone is winning right now by accident, so therefore we should think about how many people might be in that situation. I would argue that if a large number of users end up with a very similar pattern in terms of how pg_class is ordered, that makes the patch higher-risk than if, as I think you're arguing here, there's enough randomness in terms of where things end up in pg_class to prevent any particular pattern from predominating. In the latter case, one or two really unlucky users could end up worse off, but that's not really an issue. What would be an issue is if we regressed some kind of common pattern.

I admit that's a bit speculative and I'm probably being a little paranoid here: doing smart things is typically better than doing dumb things, and what we're doing right now is dumb. On the other hand, once we ship something, we can't pull it back. If it causes a problem, someone will call me at 2am and need their system fixed right now. If my answer is "well, there are no configuration knobs we can change and no way to get back to the old behavior and I'm sorry you're having that problem but the only answer is for you to run all your VACUUMs manually until two years from now when maybe the algorithm will have been improved," it's not going to be a very good night.

After 15 years at EDB, I've learned that the problem isn't being wrong per se; it's having no way to get out from under being wrong. It is absolutely inevitable that I will screw up, you will screw up, the project as a whole will screw up, and that doesn't worry me a bit. What does worry me is the prospect that we won't have thought hard enough about what we're going to do if and when that happens. Most of the customers that I've gotten to work with over the years are very gracious about things going wrong with the software as long as there are some options to deal with the problem.

I fully admit that this patch may already be good enough that I'll never hear a single customer complain about it, but the time to think through the reverse scenario, where some users are unhappy, is before we ship, not after. That necessarily involves some speculation about what might go wrong and some of that speculation may be groundless, but speculation causes a lot less pain than angry customers whose problems you can't fix.

-- 
Robert Haas
EDB: http://www.enterprisedb.com