From: Chris Wilson
Date: Thu, 21 Aug 2025 19:38:34 +0100
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
To: Adrian Klaver
Cc: depesz@depesz.com, PostgreSQL General

If all your queries are coming through pgBouncer, and only those hang (the server itself responds if you connect directly to it), then it might be this pgBouncer issue:

https://github.com/pgbouncer/pgbouncer/issues/1054

Although that issue is now "closed", because the invisible "debug" log message was upgraded to a warning (and I don't think that change is in any released version), the underlying problem still exists: pgBouncer hangs completely (stops forwarding packets) for a while if the PAM authentication queue becomes full.

If you have a relatively slow PAM service (such as pam_ldap) then you can trigger it by opening ~100 connections to pgBouncer simultaneously (without waiting for previous ones to authenticate), something like this:

for i in $(seq 1 100); do psql -h pgbouncer -p 6432 -U user db_name -c "SELECT 1" & done; wait
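
A slightly longer sketch of the same reproduction, in case it helps to see how long each connection took rather than just firing them off. The function name `burst` and the `PSQL` override are made up for illustration, the host, port, user and database are the same placeholders as in the one-liner, and `date +%s%N` assumes GNU date:

```shell
# Hypothetical helper: open N client connections at once and print, for each,
# "<elapsed_ms> <index> <exit_status>". PSQL can be overridden for dry runs.
PSQL=${PSQL:-psql}

burst() {
  n=$1
  shift
  for i in $(seq 1 "$n"); do
    (
      start=$(date +%s%N)                 # nanoseconds; assumes GNU date
      "$PSQL" "$@" -c "SELECT 1" >/dev/null 2>&1
      rc=$?
      end=$(date +%s%N)
      echo "$(( (end - start) / 1000000 )) $i $rc"
    ) &
  done
  wait                                    # block until every client has finished
}
```

Something like `burst 100 -h pgbouncer -p 6432 -U user db_name | sort -n | tail -n 5` would then list the five slowest connections. Running the same burst directly against the server port should show whether the stall only appears when going through pgBouncer.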

Thanks, Chris.

On Thu, 21 Aug 2025 at 19:17, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> On 8/21/25 09:51, hubert depesz lubaczewski wrote:
> > On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:
> >> Getting to the bottom of the bag of ideas:
> >> Have you looked at the OS system log for the time period involved?
> >
> > Yes. Mostly dmesg. Nothing interesting logged around the time.
> >
> >> You mentioned this seemed to involve PREPARE and DISCARD ALL.
> >> Is this the same set of statements or is it all over the place?
> >
> > No. From what I can tell it's random sample.
> >
> >> Also it would be helpful to know what bouncer you are actually using and
> >> what mode you are running in?
> >
> > pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
> > Applications go using transaction pooling, but people (dbas, ops) have
> > session pooling.
>
> Have you looked at?:
>
> https://www.pgbouncer.org/changelog.html#pgbouncer-124x
>
> To see if anything stands out.
>
> Then there is:
>
> https://www.pgbouncer.org/config.html#max_prepared_statements
>
> The below may also be worth looking at:
>
> https://github.com/pgbouncer/pgbouncer/pull/1144
>
> I can't help thinking that there is a caching issue at stake, though
> that is just a guess.
>
> >
> > Best regards,
> >
> > depesz
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com