public inbox for [email protected]  
help / color / mirror / Atom feed
From: Achilleas Mantzios <[email protected]>
To: Mladen Marinović <[email protected]>
To: Efrain J. Berdecia <[email protected]>
Cc: SERHAD ERDEM <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Different execution plans in PG17 and pgBouncer...
Date: Mon, 5 May 2025 15:09:01 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAHjkqPQCmRwDMLSX2N_Gp+5WL+xJbtaoF2_aVCHxxQFPQApFxA@mail.gmail.com>
References: <CAHjkqPQFy2GnzRQHSkc8FXJdOYEV46cTQ9KjHXDUctCMHvveBw@mail.gmail.com>
	<[email protected]>
	<CAHjkqPRn3nHOcrTSYtew84zUvHA=SvVPnYivzjEvQ=MkYzj_5w@mail.gmail.com>
	<[email protected]>
	<CAHjkqPRsircyROY46YF-3XkTuwZCpKK-OZRcSQ_NchSdinaxOw@mail.gmail.com>
	<DB9PR02MB9948F2D8B4D024A4E78C08EEA38E2@DB9PR02MB9948.eurprd02.prod.outlook.com>
	<CAHjkqPS6o2S-XTCKAyf9J=aKX1i_KhPW4QuqOpKNLMZx+QwSCw@mail.gmail.com>
	<[email protected]>
	<CAHjkqPQCmRwDMLSX2N_Gp+5WL+xJbtaoF2_aVCHxxQFPQApFxA@mail.gmail.com>


On 5/5/25 14:26, Mladen Marinović wrote:
> Hi,
>
> Mystery not solved...but identified. The pool is in transaction mode 
> and some connections use set enable_mergejoin=off, but they do not set 
> it back to on. Upon getting the connection from the pool the parameter 
> is still set to off causing the planner to not use this kind of join 
> which results in different plans when using this tainted pgbouncer 
> connection instead of the clean one from pg17.
>
> The problem is that server_reset_query is not used when the pool is in 
> transaction mode. Now, we have to see how to fix this problem.

But you've got this : https://www.pgbouncer.org/config.html

"


      server_reset_query_always

Whether |server_reset_query| should be run in all pooling modes. When 
this setting is off (default), the |server_reset_query| will be run only 
in pools that are in sessions-pooling mode. Connections in 
transaction-pooling mode should not have any need for a reset query.

This setting is for working around broken setups that run applications 
that use session features over a transaction-pooled PgBouncer. It 
changes non-deterministic breakage to deterministic breakage: Clients 
always lose their state after each transaction.

"


>
> Regards,
> Mladen Marinović
>
> On Mon, May 5, 2025 at 3:10 PM Efrain J. Berdecia 
> <[email protected]> wrote:
>
>     Is the query using parameter markers? Is the source executing the
>     query forcing a "bad" data type casting?
>
>     Yahoo Mail: Search, Organize, Conquer
>     <https://mail.onelink.me/107872968?pid=nativeplacement&c=US_Acquisition_YMktg_315_SearchOrgConque...;
>
>         On Mon, May 5, 2025 at 8:52 AM, Mladen Marinović
>         <[email protected]> wrote:
>
>
>         On Mon, May 5, 2025 at 2:38 PM SERHAD ERDEM
>         <[email protected]> wrote:
>
>             Hi  , you had better try  vacuum  analyze for the whole db
>             ,   pgbouncer  connection layer can not causeslow queries.
>
>
>         I did that already. But the slow query is the consequence of
>         the different plan, not the statistics.
>
>             ------------------------------------------------------------------------
>             *From:* Mladen Marinović <[email protected]>
>             *Sent:* Monday, May 5, 2025 12:27 PM
>             *To:* Achilleas Mantzios <[email protected]>
>             *Cc:* [email protected]
>             <[email protected]>
>             *Subject:* Re: Different execution plans in PG17 and
>             pgBouncer...
>
>
>             On Mon, May 5, 2025 at 12:07 PM Achilleas Mantzios
>             <[email protected]> wrote:
>
>
>                 On 5/5/25 11:00, Mladen Marinović wrote:
>
>
>
>                     On Mon, May 5, 2025 at 11:24 AM Achilleas Mantzios
>                     <[email protected]> wrote:
>
>
>                         On 5/5/25 09:52, Mladen Marinović wrote:
>
>                             Hi,
>
>                             We recently migrated our production
>                             instances from PG11 to PG17. While doing
>                             so we upgraded our pgBouncer instances
>                             from 1.12 to 1.24. As everything worked on
>                             the test servers we pushed this to
>                             production a few weeks ago. We did not
>                             notice any problems until a few days ago
>                             (but the problems were here from the
>                             start). The main manifestation of the
>                             problems is a service that runs a fixed
>                             query to get a backlog of unprocessed data
>                             (limited to a 1000 rows). When testing the
>                             query using pgAdmin connected directly to
>                             the database we get a result in cca. 20
>                             seconds. The same query runs for 2 hours
>                             when using pgBouncer to connect to the
>                             same database.
>
>
>                         That's a huge jump, I hope you guys did
>                         extensive testing of your app. In which
>                         language is your app written? If java, then
>                         define prepareThreshold=0 in your jdbc and set
>                         max_prepared_statements = 0 in pgbouncer.
>
>                     Mainly python, but the problem was noticed in a
>                     java service.
>                     Prepare treshold was already set to 0. We changed
>                     the max_prepared_statements to 0 from the default
>                     (200) but no change was noticed.
>
>                         How about search paths ? any difference on
>                         those between the two runs ? Do you set
>                         search_path in pgbouncer ? what is "cca." btw ?
>
>
>                             The more interesting part is that when we
>                             issue an explain of the same query we get
>                             different plans. We did this a few seconds
>                             apart so there should be no difference in
>                             collected statistics. We ruled out
>                             prepared statements, as we suspected the
>                             generic plan might be the problem, but it
>                             is not. Is there any pgBouncer or PG17
>                             parameter that might be the cause of this?
>
>
>                         Does this spawn any connections (such as
>                         dblink) ? are there limits per user/db
>                         pool_size in pgbouncer ?
>
>                     No additional connection nor dbling. Just plain
>                     SQL (CTE, SELECT, INSERT, UPDATE, DELETE,...)
>                     There are limits, but they are not hit. The query
>                     just uses a different plan and runs slower because
>                     of that.
>
>                         Pgbouncer, in contrast to its old friend
>                         PgPool-II is completely passive, just passes
>                         through SQL to the server as fast as possible
>                         as it can. But I am sure you know that. Good
>                         luck, keep us posted!
>
>                     Yes, that is what puzzles me.
>
>                 What is the pgbouncer's timeout in the server
>                 connections ?
>
>                 How about "idle in transaction" ? do you get any of
>                 those? What's the isolation level ?
>
>                 How about the user ? is this the same user doing
>                 pgadmin queries VS via the app ?
>
>                 Can you identify the user under which the problem is
>                 manifested and :
>
>                 ALTER user "unlucky_user" SET log_statement = 'all';
>
>                 ALTER user "unlucky_user" SET
>                 log_min_duration_statement = 0; -- to help you debug
>                 the prepared statements .. just in case , and other
>                 stuff not printed by log_statement = all.
>
>             None of those parameters should affect the fact that when
>             issuing the explain select query (the statement is not
>             prepared) from psql directly gives a different result than
>             issuing it over the pgbouncer connection. The result is
>             repeatable.
>
>             We have rolled back pgbouncer to 1.12. and it seems the
>             problem persists. This is one of the weirdest things I
>             have ever seen with PostgreSQL.
>
>                     Regards,
>                     Mladen Marinović
>

view thread (13+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Different execution plans in PG17 and pgBouncer...
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox