Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vIxhO-004v9Q-2t for pgsql-hackers@arkaria.postgresql.org; Tue, 11 Nov 2025 23:22:54 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vIxhM-008IR3-0Y for pgsql-hackers@arkaria.postgresql.org; Tue, 11 Nov 2025 23:22:52 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vIxhL-008IQv-2C for pgsql-hackers@lists.postgresql.org; Tue, 11 Nov 2025 23:22:51 +0000 Received: from mail-vs1-xe30.google.com ([2607:f8b0:4864:20::e30]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vIxhJ-006cgy-22 for pgsql-hackers@postgresql.org; Tue, 11 Nov 2025 23:22:50 +0000 Received: by mail-vs1-xe30.google.com with SMTP id ada2fe7eead31-5dd6fbe50c0so119370137.2 for ; Tue, 11 Nov 2025 15:22:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=xzilla-net.20230601.gappssmtp.com; s=20230601; t=1762903369; x=1763508169; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=SkpY5xs2WebzWzhvknId1F2MOzQzHc5jr/WtY4z+7W4=; b=ybZQRLZaEKHiI1I329ZexiinPugExUaAKvNIE/RbqIivDXHr/lYq29CnqCuIFxdcOJ rhBeSpw8/P6j9QQ3o5nlPRVzVdFQ76GftfZuCR8aoOCn81Kl6rkPx9oFS0LCCQxyg1Bu wi0oNawoQmXeKh1O3D73lzRQe6zl+9hhC+sSW2rEdgJ5TJoTvXgh0SHdzvxl6RRmuqgf wCt+mu2r79HU2JxbVaJD1V4kNUw8g0xuxxMlMcVItvrd/QVGNqj5z7XqpGaocOk8J0CJ XUVVBUOfsJCYb0mK6/XzuzcpXMVXaQZkemUB6XUd3Wy+dmYfzBfaLcL6WX/dbE6Qiwvp 2CqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762903369; x=1763508169; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=SkpY5xs2WebzWzhvknId1F2MOzQzHc5jr/WtY4z+7W4=; b=NrWSiJ9sU4956otZHlQo4/osC4iBi9OIt1wIv/0suSZBY7otZkvYCatz9rwLqkq+BK 5qyKZlzTH365Fa3arwNmWfFla5K0pbZy3pO9QXbxpYHkkfjXDlqfQeZinLQIaYZpgVg4 9TWytOioDNcipykFcUiKUUfHJb1iL+s8UdFzGTGESLaAe3ljmSsZOQC9AgeqMm28vQW9 RlJZufvNRbf2YGn69N/mqnr8qncggtz7sNUbUV+fQw+faLBrmT/iy3eaWOvgJu/OiSnw 555hwJk523cZG2Qc9b5PADHXdIWsu2ej4HZUCstn/pMnSBodRFSfIpnH2+a5VzlCLa3m K9sg== X-Forwarded-Encrypted: i=1; AJvYcCX8jqb+ljJIuQjqzQTeSYoL7EL+HwX8QDTK7+3Xr7/bSJ3t6jJHANShUlhT2cDkuIp4/6gJ00eGM12wKnR2@postgresql.org X-Gm-Message-State: AOJu0YysunwC5Ca31S/O8fFTqVWfEHcVEBuMpgpnujGX+6PM/Vcuk7+H Oqa0EVesSSTbrnoQOfIhncVavuymq0/fP2KPJQ15nQvzDFAJBCshOTrsoX9CcSKjJ9L/df5MJCz 5RQuXY9gcVecZIBWPTkiInFYw+5fcBNQpuE90b1lc3w== X-Gm-Gg: ASbGncueKFi53dq83NlgnkidlLZNdGbPwYRLXiUAQQhAq6sSXci396tOoVUdEvJ+THC wEhayQrID4qi3PaqN/4oam1SCpa+PcmUX76TkzEL7qKkK/HrJW0Pj3/DFPUgyrQdCxMqhUDs9n5 MGojQ7hftHbx6NLpkAi+zVC7kaP0yIXfIBJUrXsynQ/3ooJkH44n6v7EPvM57yojKqknDljzQrG uvrB+YUFb9FS0oDkdXdhNNdgAxRZhe5BkI8azvocxXiYFJ7htA8MG4PXW9v X-Google-Smtp-Source: AGHT+IFLd4ShlN/g5/ejfp3q6SXDulXCKfo29Fb7zRJxqL29Z8x9fJs6Z4w7FcVcvTiqQzk29xYIfQ3kx0bIFzcdvWg= X-Received: by 2002:a05:6102:5114:b0:5d5:dbbb:5b86 with SMTP id ada2fe7eead31-5de07e2ba90mr213571137.25.1762903368748; Tue, 11 Nov 2025 15:22:48 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Robert Treat Date: Tue, 11 Nov 2025 18:22:36 -0500 X-Gm-Features: AWmQ_bnhBeb7ubRuzB6aYsF0vaXPNEVylHZ8Q4XKbt4OL_45myjPWiIerpQEJgs Message-ID: Subject: Re: another autovacuum scheduling thread To: David Rowley Cc: Nathan Bossart , Sami Imseih , Robert Haas , Jeremy Schneider , pgsql-hackers@postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Nov 11, 2025 at 3:27=E2=80=AFPM David Rowley = wrote: > On Wed, 12 Nov 2025 at 09:13, Nathan Bossart w= rote: > > On Wed, Nov 12, 2025 at 09:03:54AM +1300, David Rowley wrote: > > > /* when enough time has passed, refresh the list to ensure the > > > scores aren't too out-of-date */ > > > if (time is > lastcheck + autovacuum_naptime * ) > > > { > > > list_free_deep(tables_to_process); > > > goto the_top; > > > } > > > } // end of foreach(cell, tables_to_process) > > > > My concern is that this might add already-processed tables back to the > > list, so a worker might never be able to clear it. Maybe that's not a = real > > problem in practice for some reason, but it does feel like a step too f= ar > > for stage 1, as you said above. > > Oh, that's a good point. That's a very valid concern. I guess that > could be fixed with a hashtable of vacuumed tables and skipping tables > that exist in there, but the problem with that is that the table might > genuinely need to be vacuumed again. It's a bit tricky to know when a > 2nd vacuum is a legit requirement and when it's not. Figuring that out > might me more logic that this code wants to know about. > Yeah, there is a common theoretical pattern that always comes up in these discussions where autovacuum gets stuck behind N big tables + (AVMW - N) small tables that keep filtering up to the top of the list, and I'm not saying that would never be a problem, but assuming the algorithm is working correctly, this should be fairly avoidable, because the use of xid age essentially works as a "hash of vacuumed tables" equivalent for tracking purposes. Walking through it, once a table is vacuumed, it should go to the bottom of the list. The only way it crops back-up quickly is due to significant activity on it, but even then, you need a special set of circumstances, like it needs to be a small enough table with heavy updates and a small autovacuum_vacuum_threshold. This type of combo would cause the table to look like it is excessively bloated and in need of vacuuming, but even in this scenario, eventually other tables will get an xid age high enough that they will "out rank" the high activity table and get their turn. TBH I'm not sure if we need to do replanning, but in the scenarios where I have used it, having more accurate information on the state of the database has generally been better than relying on more stale information. Of course it isn't 100%, but the current implementation isn't either, and don't forget we still have the failsafe_age as, well, a failsafe. Robert Treat https://xzilla.net