Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uBvJM-008HqB-9G for pgsql-general@arkaria.postgresql.org; Mon, 05 May 2025 12:52:44 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uBvJL-003S68-5n for pgsql-general@arkaria.postgresql.org; Mon, 05 May 2025 12:52:43 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uBvJK-003S5y-Ia for pgsql-general@lists.postgresql.org; Mon, 05 May 2025 12:52:42 +0000 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uBvJH-000GGg-0m for pgsql-general@lists.postgresql.org; Mon, 05 May 2025 12:52:41 +0000 Received: by mail-pf1-x431.google.com with SMTP id d2e1a72fcca58-72d3b48d2ffso4407303b3a.2 for ; Mon, 05 May 2025 05:52:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kset.org; s=google; t=1746449559; x=1747054359; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=KIAAY/oNnrENcDIiNu4ux7WDf/pHtgs7fgX4ZvC9clU=; b=owIwJFjZgYSjHm8+R6N6N/CXp4hsjemaEeR6wrGj7oQZRXQmmCGbApl8b7Sn4z8J38 Ku1e9Lkp50u+4iIAzwOOGoGX+N01RFDZ4fHu1mL1VJXnebsv81OljR+E1p/SH7UY0ClO 4Xdo5Fkp1sEEqa6c+CQD4nORSVNxUz/U2R3PllRwZU4ZeSm9NKL7/NAKZnoKmOwhLfKV Tq74mBf3Ya0p5fFIBM8gLrq5VHMNEB32VNEtx4R7Ac8oqpiCZb6pBzuvxLMw8aoCNP4h zRp2h7NTU6saMuIrxVgdjdV1yYYl4ANa7J+1gzAxldlmLJ2JLKUafJoavgefVBMOzo0+ hqOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746449559; x=1747054359; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=KIAAY/oNnrENcDIiNu4ux7WDf/pHtgs7fgX4ZvC9clU=; b=dfMkU2kPIo2QwpiTo+Emdh6igk+8ZLJakx0sRvAzp2B1jGgbmSavjoX4XMl8cOMzIb 6l6JKsnQyCvJb2Gg+sYWRgKExysfEs/amXqkIA8+LVXL+ISZVqzJBSXUeWIgXnCwti8r I0J1FsfB3p8qRj8z8cP/xtugbJV6gWOJkjXaJR5HAsyxoscRk0RBbgwh5EytN19YSGF3 /xxCWS/Hhgafd9NbHw8/MD3ZmooBprUpQXaGD3oKteDt+GTxZEh413g3qOQmoR9M/Pc/ jQhylFJENFuwccijdUD6EfWssCMAlLwZkd78QeB4CxeopMP1BvXR3G6ObhfL2ee0Dk4f vY9w== X-Forwarded-Encrypted: i=1; AJvYcCVJTuljRXJsbhLJCHrsU9VWx6VDsyWJU6d7JNJcqN371Lc0Xi4DS1dQXJO0XtlStSWf/6apT4YafIJddJZh@lists.postgresql.org X-Gm-Message-State: AOJu0YyMguzWDvqrAWFWPXdIQEcWJDGfbnFd7qk+/RWXeZ4l6xNLkp7U U46f3QKZhADfFGAp/b6Av6QvUtI1SBr/QX1ZJDyJvOQkLFnKADybawRwfkLD1Go8iE5dNskIciw DJpanz6fjyXWnuzpsh270dQwR7VaV0q/stY5wE23sx12VcZXJzmG73A== X-Gm-Gg: ASbGncsS/ycYvzealKliUHG7Yej4rt4CxAr0EMdvYsHiSQfafa+l8VWCrX2MGrdKAda XpGQBbp41YEVWxWiP7kgZ1LBPtGgmlz4GbcYxCSJb9Bf4kwuL5Hsnc1f9uUrIwPEf10diBijOjX T+VtP9e75U5iFO4fMty1QBFg== X-Google-Smtp-Source: AGHT+IGskYy9OhfcMVgtgADGp+IlnSvwwetuT5QJU3r6chQZ0VPMzeue8DBJLGqr0AAjSA+UemhefTOD8PdL9NXTI0Q= X-Received: by 2002:a05:6a00:8d8d:b0:736:34ca:deee with SMTP id d2e1a72fcca58-7406f0aff36mr9543246b3a.7.1746449558893; Mon, 05 May 2025 05:52:38 -0700 (PDT) MIME-Version: 1.0 References: <4ec38b6d-bd92-4055-8d2f-7efa583a2b9f@cloud.gatewaynet.com> In-Reply-To: From: =?UTF-8?Q?Mladen_Marinovi=C4=87?= Date: Mon, 5 May 2025 14:52:26 +0200 X-Gm-Features: ATxdqUHPcSL_aZz5Tk37xq8bnBIaH1l1wqjdrIs2mxcTokmJKSSKf9gedGJk-cg Message-ID: Subject: Re: Different execution plans in PG17 and pgBouncer... To: SERHAD ERDEM Cc: Achilleas Mantzios , "pgsql-general@lists.postgresql.org" Content-Type: multipart/alternative; boundary="000000000000938f83063462f9d5" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000938f83063462f9d5 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, May 5, 2025 at 2:38=E2=80=AFPM SERHAD ERDEM w= rote: > Hi , you had better try vacuum analyze for the whole db , pgbouncer > connection layer can not causeslow queries. > I did that already. But the slow query is the consequence of the different plan, not the statistics. > ------------------------------ > *From:* Mladen Marinovi=C4=87 > *Sent:* Monday, May 5, 2025 12:27 PM > *To:* Achilleas Mantzios > *Cc:* pgsql-general@lists.postgresql.org < > pgsql-general@lists.postgresql.org> > *Subject:* Re: Different execution plans in PG17 and pgBouncer... > > > > On Mon, May 5, 2025 at 12:07=E2=80=AFPM Achilleas Mantzios < > a.mantzios@cloud.gatewaynet.com> wrote: > > > On 5/5/25 11:00, Mladen Marinovi=C4=87 wrote: > > > > On Mon, May 5, 2025 at 11:24=E2=80=AFAM Achilleas Mantzios < > a.mantzios@cloud.gatewaynet.com> wrote: > > > On 5/5/25 09:52, Mladen Marinovi=C4=87 wrote: > > Hi, > > We recently migrated our production instances from PG11 to PG17. While > doing so we upgraded our pgBouncer instances from 1.12 to 1.24. As > everything worked on the test servers we pushed this to production a few > weeks ago. We did not notice any problems until a few days ago (but the > problems were here from the start). The main manifestation of the problem= s > is a service that runs a fixed query to get a backlog of unprocessed data > (limited to a 1000 rows). When testing the query using pgAdmin connected > directly to the database we get a result in cca. 20 seconds. The same que= ry > runs for 2 hours when using pgBouncer to connect to the same database. > > > That's a huge jump, I hope you guys did extensive testing of your app. In > which language is your app written? If java, then define prepareThreshold= =3D0 > in your jdbc and set max_prepared_statements =3D 0 in pgbouncer. > > Mainly python, but the problem was noticed in a java service. > Prepare treshold was already set to 0. We changed the max_prepared_state= ments > to 0 from the default (200) but no change was noticed. > > How about search paths ? any difference on those between the two runs ? D= o > you set search_path in pgbouncer ? what is "cca." btw ? > > > The more interesting part is that when we issue an explain of the same > query we get different plans. We did this a few seconds apart so there > should be no difference in collected statistics. We ruled out prepared > statements, as we suspected the generic plan might be the problem, but it > is not. Is there any pgBouncer or PG17 parameter that might be the cause = of > this? > > > Does this spawn any connections (such as dblink) ? are there limits per > user/db pool_size in pgbouncer ? > > No additional connection nor dbling. Just plain SQL (CTE, SELECT, INSERT, > UPDATE, DELETE,...) There are limits, but they are not hit. The query jus= t > uses a different plan and runs slower because of that. > > Pgbouncer, in contrast to its old friend PgPool-II is completely passive, > just passes through SQL to the server as fast as possible as it can. But = I > am sure you know that. Good luck, keep us posted! > > Yes, that is what puzzles me. > > What is the pgbouncer's timeout in the server connections ? > > How about "idle in transaction" ? do you get any of those? What's the > isolation level ? > > How about the user ? is this the same user doing pgadmin queries VS via > the app ? > > Can you identify the user under which the problem is manifested and : > > ALTER user "unlucky_user" SET log_statement =3D 'all'; > > ALTER user "unlucky_user" SET log_min_duration_statement =3D 0; -- to hel= p > you debug the prepared statements .. just in case , and other stuff not > printed by log_statement =3D all. > > None of those parameters should affect the fact that when issuing the > explain select query (the statement is not prepared) from psql directly > gives a different result than issuing it over the pgbouncer connection. T= he > result is repeatable. > > We have rolled back pgbouncer to 1.12. and it seems the problem persists. > This is one of the weirdest things I have ever seen with PostgreSQL. > > > Regards, > Mladen Marinovi=C4=87 > > --000000000000938f83063462f9d5 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Mon, May 5, = 2025 at 2:38=E2=80=AFPM SERHAD ERDEM <serhade@hotmail.com> wrote:
Hi=C2=A0 , you had better try=C2=A0 vacuum=C2=A0 analyze for the whole db ,= =C2=A0=C2=A0 pgbouncer=C2=A0 connection layer can not causeslow queries.

I did that already. But the sl= ow query is the consequence of the different plan, not the statistics.
=C2=A0

From:=C2=A0Mladen Marinovi=C4=87 <marin@kset.org>
Sent:=C2=A0Monday, May 5, 2025 12:27 PM
To:=C2=A0Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> Cc:=C2=A0pgsql-general@lists.postgresql.org <pgsql-general@lists.po= stgresql.org>
Subject:=C2=A0Re: Different execution plans in PG17 and pgBouncer...=
=C2=A0


On Mon, May 5, 2025 at 12:07=E2=80=AFPM Achill= eas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


On 5/5/25 11:00, Mladen Marinovi=C4=87 wrote:<= /div>


On Mon, May 5, 2025 at 11:24=E2=80=AFAM Achill= eas Mantzios <a.mantzios@cloud.gatewaynet.com> wrote:


On 5/5/25 09:52, Mladen Marinovi=C4=87 wrote:<= /div>
Hi,

We recently migrated our production instances = from PG11 to PG17. While doing so we upgraded our pgBouncer instances from = 1.12 to 1.24. As everything worked on the test servers we pushed this to pr= oduction a few weeks ago. We did not notice any problems until a few days ago (but the problems were here f= rom the start). The main manifestation of the problems is a service that ru= ns a fixed query to get a backlog of unprocessed data (limited to a 1000 ro= ws). When testing the query using pgAdmin connected directly to the database we get a result in cca. 20 seco= nds. The same query runs for 2 hours when using pgBouncer to connect to the= same database.


That's a huge jump, I hope you guys did exte= nsive testing of your app. In which language is your app written? If java, = then define prepareThreshold=3D0 in your jdbc and set max_prepared_statements =3D 0 in pgbouncer.

Mainly python, but the problem was noticed in = a java service.
Prepare treshold was already set to 0. We chan= ged the=C2=A0 max_prepared_statements to 0 from the default (200) but no change was noticed.

How about search paths ? any difference on those= between the two runs ? Do you set search_path in pgbouncer ? what is "= ;cca." btw ?


The more interesting part is that when we issu= e an explain of the same query we get different plans. We did this a few se= conds apart so there should be no difference in collected statistics. We ru= led out prepared statements, as we suspected the generic plan might be the problem, but it is not. Is ther= e any pgBouncer or PG17 parameter that might be the cause of this?


Does this spawn any connections (such as dblink)= ? are there limits per user/db pool_size in pgbouncer ?

No additional connection nor dbling. Just plai= n SQL (CTE, SELECT, INSERT, UPDATE, DELETE,...) There are limits, but they = are not hit. The query just uses a different plan and runs slower because o= f that.

Pgbouncer, in contrast to its old friend PgPool-= II is completely passive, just passes through SQL to the server as fast as = possible as it can. But I am sure you know that. Good luck, keep us posted!=

Yes, that is what puzzles me.

What is the pgbouncer's timeout in the serve= r connections ?

How about "idle in transaction" ? do y= ou get any of those? What's the isolation level ?

How about the user ? is this the same user doing= pgadmin queries VS via the app ?

Can you identify the user under which the proble= m is manifested and :

ALTER user "unlucky_user" SET log_stat= ement =3D 'all';

ALTER user "unlucky_user" SET log_min_= duration_statement =3D 0; -- to help you debug the prepared statements .. j= ust in case , and other stuff not printed by log_statement =3D all.

None of those parameters should affect the fac= t that when issuing the explain select query (the statement is not prepared= ) from psql directly gives a different result than issuing it over the pgbo= uncer connection. The result is repeatable.

We have rolled back pgbouncer to 1.12. and it = seems the problem persists. This is one of the weirdest things I have ever = seen with PostgreSQL.
=C2=A0
Regards,
Mladen Marinovi=C4=87
--000000000000938f83063462f9d5--