Date: Wed, 20 Aug 2025 13:32:34 +0200
From: hubert depesz lubaczewski
Reply-To: depesz@depesz.com
To: Adrian Klaver
Cc: PostgreSQL General
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

On Tue, Aug 19, 2025 at 11:39:03AM -0700, Adrian Klaver wrote:
> > Every now and then (usually every 3-5 minutes, but not through the whole
> > day), we see situations where every query suddenly takes ~ 1 second.
> Given the subject line, what you are reporting is happening on the replica,
> correct?

Yes.

> If so where is the replica relative to the primary, in terms of network
> distance?

=$ ping -c 10 primary

reports:

10 packets transmitted, 10 received, 0% packet loss, time 9181ms
rtt min/avg/max/mdev = 0.942/0.956/0.991/0.012 ms

> Also what are the 'hardware' specifications on the replica instance?

c8g.48xlarge EC2 instance. It is arm64, 192 cores, with 384 GB of RAM.

As for storage, it is relatively slow, because this db is rather small:
gp3 500 GB volume, with 6000 IOPS. At no point is IO in any way close to the
limits; the whole db fits easily in RAM.

Best regards,

depesz