Date: Fri, 22 Aug 2025 17:42:08 +0200
From: hubert depesz lubaczewski
To: Adrian Klaver
Cc: Tom Lane, PostgreSQL General, Chris Wilson
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

On Fri, Aug 22, 2025 at 08:39:21AM -0700, Adrian Klaver wrote:
> On 8/22/25 08:30, hubert depesz lubaczewski wrote:
> > On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
> > > hubert depesz lubaczewski writes:
> > > > I got repeatable case today. It is breaking on its own every
> > > > ~ 5 minutes.
> > >
> > > Interesting. That futex call is presumably caused by interaction
> > > with some other process within the standby server, and the only
> > > plausible candidate really is the startup process (which is replaying
> > > WAL received from the primary). There are cases where WAL replay
> > > will take locks that can block queries on the standby. Can you
> > > correlate the delays on the standby server with any DDL events
> > > occurring on the primary?
> >
> > Nope. Plus there is certain repetition of these cases, so even if I'd
> > miss *some* create table/alter, it just isn't going to be happening
> > every 4-5 minutes.
> >
>
> > So, while there are outliers, I'd say that most of the problems happen every
> > 3-5 minutes.
>
> Are you using the Postgres community version or the AWS variant?

Community. From pgdg repo.

Best regards,

depesz