Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCHL9-00BYQA-7t for pgsql-general@arkaria.postgresql.org; Fri, 24 Oct 2025 12:56:18 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vCHL8-00GNTi-4t for pgsql-general@arkaria.postgresql.org; Fri, 24 Oct 2025 12:56:17 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCHL7-00GNTW-QX for pgsql-general@lists.postgresql.org; Fri, 24 Oct 2025 12:56:16 +0000 Received: from lana.depesz.com ([88.198.49.178] helo=depesz.com) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vCHL4-003zit-0N for pgsql-general@lists.postgresql.org; Fri, 24 Oct 2025 12:56:16 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=depesz.com; s=20170201; h=In-Reply-To:Content-Type:MIME-Version:References:Reply-To: Message-ID:Subject:Cc:To:Sender:From:Date:Content-Transfer-Encoding: Content-ID:Content-Description; bh=7daKGrgxSHlV+xjogc0ZUA2JoHLyS7EByUL1yeu+epo=; b=s2OgjzHeTF8Y/WtD/rC316Afts sWtsS+PRcm4iAyPVseJBA/sV9PauWACo+yOlqdFFffZPSDfy13aW1T3IgqMgoymgfjaOj8vNMtdEe p2zG5Iz/ztyl5OgBsGm4tycfInhxIC6NAVRDh+mzzwuH4BG08h0jV/NgWGs79yiMFtys=; Received: from depesz by depesz.com with local (Exim 4.96) (envelope-from ) id 1vCHL2-00EOZt-1O; Fri, 24 Oct 2025 14:56:12 +0200 Date: Fri, 24 Oct 2025 14:56:12 +0200 From: hubert depesz lubaczewski Sender: depesz@depesz.com To: Tom Lane Cc: PostgreSQL General Subject: Re: Why is this query touching 4gb of buffers? Message-ID: Reply-To: depesz@depesz.com References: <1754333.1761310446@sss.pgh.pa.us> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1754333.1761310446@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Oct 24, 2025 at 08:54:06AM -0400, Tom Lane wrote: > hubert depesz lubaczewski writes: > > First run of the query generated: > > ... > > -> Index Scan using index_some_table_pending on some_table (cost=0.42..178322.57 rows=611988 width=16) (actual time=27962.567..27962.567 rows=0 loops=1) > > Index Cond: (send_at <= '2025-10-23 12:35:48'::timestamp without time zone) > > Buffers: shared hit=4624 read=117838 dirtied=486 > > > Then, immediately I reran it, without reindex, without analyze, without anything. And I got: > > > -> Index Scan using index_some_table_pending on some_table (cost=0.42..178328.27 rows=612009 width=16) (actual time=0.438..0.438 rows=0 loops=1) > > Index Cond: (send_at <= '2025-10-23 12:35:48'::timestamp without time zone) > > Buffers: shared hit=424 > > > Time is irrelevant, the point is that we are going down from ~120k buffers > > "touched" to 424 buffers. What is going on? > > The first execution probably had to set hint bits on a whole lot > of recently-deleted rows. But why it doesn't happen/help on secondary? Subsequent runs on secondary still have to "touch" hundreds of thousands of pages", even if I'll do the run on primary that would reset hint bits. So, on primary - reruning the query fixes the "how many pages we touch" - but on secondary, the number generally doesn't change, at least withint 15-20 minute window. Best regards, depesz