Date: Fri, 10 Oct 2025 14:59:59 -0700
From: Jeremy Schneider
To: Robert Haas
Cc: Nathan Bossart, David Rowley, Sami Imseih, pgsql-hackers@postgresql.org
Subject: Re: another autovacuum scheduling thread
Message-ID: <20251010145959.414a2c27@ardentperf.com>

On Fri, 10 Oct 2025 16:24:51 -0400
Robert Haas wrote:

> I don't think we
> need something dramatically awesome to make a change to the status
> quo, but if it's extremely easy to think up simple scenarios in which
> a given idea will fail spectacularly, I'd be inclined to suspect that
> there will be a lot of real-world spectacular failures.

What does a real-world spectacular failure look like?

"If those 3 autovac workers had processed tables in a different order
everything would have been peachy"

But if autovac is going to get jammed up long enough to wraparound the
system, does it matter whether or not it did a one-time processing of a
bunch of small tables before it got jammed?

One particular table always scoring high shouldn't block autovac from
other tables, because it doesn't start a new iteration until it goes
all the way through the list from its current iteration, right? And one
iteration of autovac needs to process everything in the list... so it
should take the same overall time regardless of order?
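
To make the "same overall time regardless of order" point concrete, here is a rough sketch. It assumes a single worker, a fixed list of tables for one pass, and made-up vacuum durations; the table names and the finish_times helper are hypothetical and are not anything from autovacuum itself.

```python
from itertools import accumulate


def finish_times(durations_by_table, order):
    """Return {table: minutes elapsed when its vacuum finishes} for one pass."""
    elapsed = accumulate(durations_by_table[name] for name in order)
    return dict(zip(order, elapsed))


# Hypothetical vacuum durations in minutes (illustrative numbers only).
durations = {"big": 120, "medium": 30, "small": 5}

big_first = finish_times(durations, ["big", "medium", "small"])
small_first = finish_times(durations, ["small", "medium", "big"])

# The whole pass ends at the same time either way.
total_big_first = max(big_first.values())
total_small_first = max(small_first.values())

print(big_first)
print(small_first)
print(total_big_first == total_small_first)
```

Under those assumptions the total for the pass is identical in both orders; the only thing ordering changes is when each individual table finishes within the pass.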

The spectacular failures I've seen with autovac usually come down to
things like too much sleeping (cost_delay) or too few workers, where
better ordering would be nice but probably wouldn't fix any real
problems leading to the spectacular failures.

From Robert's 2024 pgConf.dev talk:
1. slow - forward progress not fast enough
2. stuck - no forward progress
3. spinning - not accomplishing anything
4. skipped - thinks not needed
5. starvation - can't keep up

I don't think any of these are really addressed by simply changing
table order.

From Robert's 2022 email to hackers:
> A few people have proposed scoring systems, which I think is closer
> to the right idea, because our basic goal is to start vacuuming any
> given table soon enough that we finish vacuuming it before some
> catastrophe strikes.
...
> If table A will cause wraparound in 2 hours and take 2 hours to
> vacuum, and table B will cause wraparound in 1 hour and take 10
> minutes to vacuum, table A is more urgent even though the catastrophe
> is further out.

Robert, it sounds to me like the main use case you're focused on here
is where basically wraparound is imminent - we are already screwed -
and our very last hope was that a last-ditch autovac can finish just in
time.

Failsafe and dynamic cost updates were huge advancements. Do we allow
dynamic adjustment to worker count yet?

I hope y'all just pick something and commit it without getting too lost
in the details. I honestly think in the list of improvements around
autovac, this is the lowest priority on my list of hopes and dreams as
a user for wraparound prevention :) because if this ever matters to me
for avoiding wraparound, I was screwed long before we got to this point
and this is not going to fix my underlying problems.

-Jeremy