Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1odz10-0007fN-6W for pgsql-bugs@arkaria.postgresql.org; Thu, 29 Sep 2022 19:16:10 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1odyzz-0002SC-Nz for pgsql-bugs@arkaria.postgresql.org; Thu, 29 Sep 2022 19:15:07 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1odyMR-0007iG-OF for pgsql-bugs@lists.postgresql.org; Thu, 29 Sep 2022 18:34:15 +0000 Received: from mail-ej1-x62f.google.com ([2a00:1450:4864:20::62f]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1odyML-0006S7-HE for pgsql-bugs@lists.postgresql.org; Thu, 29 Sep 2022 18:34:15 +0000 Received: by mail-ej1-x62f.google.com with SMTP id r18so4543967eja.11 for ; Thu, 29 Sep 2022 11:34:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=databricks.com; s=google; h=cc:to:subject:message-id:date:from:mime-version:from:to:cc:subject :date; bh=aR7C2I/sBHtrzs1p42OPW2W7TWrIs6+22sg939EebqY=; b=h1IkTOtcUAmiQhdo5D0qwglizDW3VlDpNKFlatSAsnHf1JmxBk15S1luWS2QN3TqGy K3Jzv3MG90RLvqZhJVxCdg6CzyIwpt2JdodGDUDMAZO6hEDNDR/nKLlg0LDgzq1rVe4M Z2veLKCfrgRBiFzYidqzMuLRjFC3YYDQMz4JR7y0FJAYKtF0NaB+yhWUFRw/TnNHghJv 3TRyX2+D/6s3RpK05H/FeB2Sj1TVANWG97X9II7VCUkkH8HT6v3O+5ALFsbwvrXW2k9V fbtoNd81FYvQjkEZZp57DnHgqXPFOaWldXVEMNn4+E4vye8+TYiXZVglMIOFEc5a98AV UHjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date; bh=aR7C2I/sBHtrzs1p42OPW2W7TWrIs6+22sg939EebqY=; b=qJ7uSDuc/RYooQpvr87YCRAILWNxXitf6LTpGpywymfrZWvOmfmaztQSALw+lPc7/Y JmHlS/U71TZXrHL+DZYfbpy0XSzC10mbUfwxoKI7rw126wWiUFfP85zjFuw+DhDRG/sG CeFxFA3iYQVnWOxYi/y9c0vAYOlbZNZehyLcm71N5RVoPDrN7GiKAN85S14HtlJ7q9+4 KoEKRvFM6dv0vnx3o6wYel3W+TSNvWn7wt8TKmEIfMUwcCiL3BbXd1GBqpZgkvteo1oh Q9GUrEIeccUDhAepbdkC90h3RAinghHfRqgZtPpM5nCJbO76aR4erUm3fvP/X12lEmit prPg== X-Gm-Message-State: ACrzQf21q45wyCNWKeEISsP+kAD/lEfoOkXc+HNPvjTIw22TzQ6dFv4t 1ELXByJGvvh0ZE3jGFg6kQhpYsncDfZiX7It4QbAJDu/5l17rg== X-Google-Smtp-Source: AMsMyM6QGFrG8slZSCBOKGb8ODFQeQ3oKajf3G3KWYDk1v72+JSqR6FAEye5kkAE9SyUZrAIquB3RMVT/oF5ZJz2kH0= X-Received: by 2002:a17:906:730f:b0:783:27c5:f65f with SMTP id di15-20020a170906730f00b0078327c5f65fmr3857072ejc.142.1664476446068; Thu, 29 Sep 2022 11:34:06 -0700 (PDT) MIME-Version: 1.0 From: Malay Keshav Date: Thu, 29 Sep 2022 11:33:54 -0700 Message-ID: Subject: [Bug][Ver 11]: Generic query plan selected is worse than custom query plan To: pgsql-bugs@lists.postgresql.org, pgsql-sql@lists.postgresql.org Cc: "malay.keshav@gmail.com" Content-Type: multipart/alternative; boundary="0000000000004ef49005e9d51f78" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000004ef49005e9d51f78 Content-Type: text/plain; charset="UTF-8" Hi, We are using Postgres 11.13 for our company's critical database. However, recently after the addition of an index to a table, we found significant degradation in a specific query's execution time. We found that Postgres11 caches a generic execution plan for a parameterized query on the 6th execution of the query based on some heuristic comparison b/w the generic plan and the custom plan for that query. In our particular case, the Postgres engine decided to pick the generic query plan and cache it for all further calls with that query. My understanding was that the generic query plan would only be selected if it had a better execution time than the custom query plan. Which in our case is not true. We were able to reproduce this deterministically using the same query parameters to trigger the engine to pick the bad generic query plan on the 6th run (first 5 runs shows the engine used the efficient query plan). Why does the engine pick the generic query plan when its execution time is worse than the custom query plan? Is this a bug? We have run vacuum analyze, created new tables from existing data, etc but the problem still persisted. Funny thing is, this only happens in one of the many deployed regions suggesting it has to do with the data distribution of that region. We were also able to trick the postgres engine into not caching the generic plan and to always go for the custom query plan on each execution. We did this by formulating a query that on the 6th execution would trigger the heuristic to pick the custom plan. However, this is not a scalable or practical solution with the 100s of queries we run against the database - finding a query that can trick the engine into selecting the custom query plan. What are our options other than upgrading to Postgres 12 which provides a configuration to override and use a custom query plan on every execution? All best, Malay Keshav --0000000000004ef49005e9d51f78 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,

We are using Postgres 11.13 for our compa= ny's critical database. However, recently after the addition of an inde= x to a table, we found significant degradation in a specific query's ex= ecution time.

We found that Postgres11 caches a ge= neric execution plan for a parameterized query on the 6th execution of the = query based on some heuristic comparison b/w the generic plan and the custo= m plan for that query.=C2=A0

In our particul= ar case, the Postgres engine decided to pick the generic query plan and cac= he it for all further calls with that query. My understanding was that the = generic query plan would only be selected if it had a better execution time= than the custom query plan. Which in our case is not true.

<= /div>
We were able to reproduce this deterministically using the same q= uery parameters to trigger the engine to pick the bad generic query plan on= the 6th run (first 5 runs shows the engine used the efficient query plan).= Why does the engine pick the generic query plan when its execution time i= s worse than the custom query plan? Is this a bug?=C2=A0

We have run vacuum analyze, created new tables from existing data, e= tc but the problem still persisted. Funny thing is, this only happens in on= e of the many deployed regions suggesting it has to do with the data distri= bution of that region. We were also able to trick the postgres engine into = not caching the generic plan and to always go for the custom query plan on = each execution. We did this by formulating a query that on the 6th executio= n would trigger the heuristic to pick the custom plan. However, this is not= a scalable or practical solution with the 100s of queries we run against t= he database - finding a query that can trick the engine into selecting the = custom query plan.

What are our options other = than upgrading to Postgres 12 which provides a configuration to override an= d use a custom query plan on every execution?

= All best,
Malay Keshav
--0000000000004ef49005e9d51f78--