Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1skNGk-00AFG8-3o for pgsql-hackers@arkaria.postgresql.org; Sat, 31 Aug 2024 12:31:54 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1skNFi-005vJl-On for pgsql-hackers@arkaria.postgresql.org; Sat, 31 Aug 2024 12:30:51 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1skNFi-005vJc-Bk for pgsql-hackers@lists.postgresql.org; Sat, 31 Aug 2024 12:30:51 +0000 Received: from mail-lf1-x12a.google.com ([2a00:1450:4864:20::12a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1skNFf-002P4X-9e for pgsql-hackers@postgresql.org; Sat, 31 Aug 2024 12:30:50 +0000 Received: by mail-lf1-x12a.google.com with SMTP id 2adb3069b0e04-533488ffaebso3304468e87.0 for ; Sat, 31 Aug 2024 05:30:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725107447; x=1725712247; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=LN5Dfh1HrbEjNpcmBEcShpyhzOPn6+1QfQgiHq9fqF8=; b=f5Vau52S2tjjMWM1SEsA3nJ/qKYshCHiRssl/QuiTgN6JkXrt0KVx+y7TAy+w795Vo a5EecSMJoIhJs+fNl8ISKg34DGvxKSDD3YDUON4QvvUA3yrojp5PN0wQ2Pqah2VM+Hv4 VX3OMqMOdjvaFZh2zysTuisLhY4zwozQfJYslht4aYftPfZTNaBz0Xm7YbB0YstzvRqI uJdohvKpyl9SlTkJvgISJ9FHlw2tbV78Ma6y9DkfJ9x5kdsGjRsYQrpqJgwRrkcuX+ZW fMsmo5h2jTNBakWllplqon8L6NacbYT3PXeZL5AZ7/pJ1kHkKBW+qs9JAV6HLFDgrwI4 bc/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725107447; x=1725712247; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LN5Dfh1HrbEjNpcmBEcShpyhzOPn6+1QfQgiHq9fqF8=; b=e9Rm0SOsKZCrMJVEs6ybSoxgP8PiQkIExhNRD2jvnc3p5IBtSfi1M5YwefQdMso3ZA zmnySa1jXWAK0L0rIYpxJBM+H5W7eg1F+ZZY/bpHjwnqRquI0CKgWC3PgyNzd/jOLG7E pYc4U/ORqCDS1vivameeAkisrLzJxaTttVvjdSQNYHO2SDL4XBGW9EObDZbxNebmjZY4 NuricaYClJqByXvUCKOuTEX9FyGSuhH3i5ZzgiQ5KnCTlyQtsN3hdKIdY2XlMhP4AdAv xZ+hq6650YAOcMlFeYRwNEj0xFTZ3/6RiXIko0BymXKt3h4LEYuoFG+ll1I0VmZkylTn efkQ== X-Forwarded-Encrypted: i=1; AJvYcCU1OJWpigSrfrO5BjxUGMPCk+XgBwzebqNqyK8u0swVgaSrrKZuCNubuXF78ltRZJxrZIXbXQc8EalZYxZb@postgresql.org X-Gm-Message-State: AOJu0Yx3Vqp2zNlTc5ZPZyEGkxTlTazRD92TwfuqkWj4Yd7ND6E0cSFO TkilIB/Hp38U2NPP8iFWDBuCPPxSq/UDyiga8Ae8cLSL3+Skcw3ofU/U0XKg44vGCOlmULaFe9J T33sfDhsrGNzhAf/f7i/yOrSJq80= X-Google-Smtp-Source: AGHT+IHR7ZxNACETkjisoi8j5WTNpjidsJQKxDpsykOJ4JVioHWkuawXV1X/MrdiCEp2Mv+HHIWOWvG7jgcx5A07xWM= X-Received: by 2002:a05:6512:2309:b0:52e:9b68:d2d4 with SMTP id 2adb3069b0e04-53546bb8200mr3042351e87.56.1725107446217; Sat, 31 Aug 2024 05:30:46 -0700 (PDT) MIME-Version: 1.0 References: <202406191709.jbvpf7d7hl6g@alvherre.pgsql> In-Reply-To: From: Junwang Zhao Date: Sat, 31 Aug 2024 20:30:34 +0800 Message-ID: Subject: Re: generic plans and "initial" pruning To: Amit Langote Cc: Robert Haas , Alvaro Herrera , Andres Freund , Daniel Gustafsson , David Rowley , PostgreSQL Hackers , Thom Brown , Tom Lane Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On Thu, Aug 29, 2024 at 9:34=E2=80=AFPM Amit Langote wrote: > > On Fri, Aug 23, 2024 at 9:48=E2=80=AFPM Amit Langote wrote: > > On Wed, Aug 21, 2024 at 10:10=E2=80=AFPM Robert Haas wrote: > > > On Wed, Aug 21, 2024 at 8:45=E2=80=AFAM Amit Langote wrote: > > > > * The replanning aspect of the lock-in-the-executor design would be > > > > simpler if a CachedPlan contained the plan for a single query rathe= r > > > > than a list of queries, as previously mentioned. This is particular= ly > > > > due to the requirements of the PORTAL_MULTI_QUERY case. However, th= is > > > > option might be impractical. > > > > > > It might be, but maybe it would be worth a try? I mean, > > > GetCachedPlan() seems to just call pg_plan_queries() which just loops > > > over the list of query trees and does the same thing for each one. If > > > we wanted to replan a single query, why couldn't we do > > > fake_querytree_list =3D list_make1(list_nth(querytree_list, n)) and t= hen > > > call pg_plan_queries(fake_querytree_list)? Or something equivalent to > > > that. We could have a new GetCachedSinglePlan(cplan, n) to do this. > > > > I've been hacking to prototype this, and it's showing promise. It > > helps make the replan loop at the call sites that start the executor > > with an invalidatable plan more localized and less prone to > > action-at-a-distance issues. However, the interface and contract of > > the new function in my prototype are pretty specialized for the replan > > loop in this context=E2=80=94meaning it's not as general-purpose as > > GetCachedPlan(). Essentially, what you get when you call it is a > > 'throwaway' CachedPlan containing only the plan for the query that > > failed during ExecutorStart(), not a plan integrated into the original > > CachedPlanSource's stmt_list. A call site entering the replan loop > > will retry the execution with that throwaway plan, release it once > > done, and resume looping over the plans in the original list. The > > invalid plan that remains in the original list will be discarded and > > replanned in the next call to GetCachedPlan() using the same > > CachedPlanSource. While that may sound undesirable, I'm inclined to > > think it's not something that needs optimization, given that we're > > expecting this code path to be taken rarely. > > > > I'll post a version of a revamped locks-in-the-executor patch set > > using the above function after debugging some more. > > Here it is. > > 0001 implements changes to defer the locking of runtime-prunable > relations to the executor. The new design introduces a bitmapset > field in PlannedStmt to distinguish at runtime between relations that > are prunable whose locking can be deferred until ExecInitNode() and > those that are not and must be locked in advance. The set of prunable > relations can be constructed by looking at all the PartitionPruneInfos > in the plan and checking which are subject to "initial" pruning steps. > The set of unprunable relations is obtained by subtracting those from > the set of all RT indexes. This design gets rid of one annoying > aspect of the old design which was the need to add specialized fields > to store the RT indexes of partitioned relations that are not > otherwise referenced in the plan tree. That was necessary because in > the old design, I had removed the function AcquireExecutorLocks() > altogether to defer the locking of all child relations to execution. > In the new design such relations are still locked by > AcquireExecutorLocks(). > > 0002 is the old patch to make ExecEndNode() robust against partially > initialized PlanState nodes by adding NULL checks. > > 0003 is the patch to add changes to deal with the CachedPlan becoming > invalid before the deferred locks on prunable relations are taken. > I've moved the replan loop into a new wrapper-over-ExecutorStart() > function instead of having the same logic at multiple sites. The > replan logic uses the GetSingleCachedPlan() described in the quoted > text. The callers of the new ExecutorStart()-wrapper, which I've > dubbed ExecutorStartExt(), need to pass the CachedPlanSource and a > query_index, which is the index of the query being executed in the > list CachedPlanSource.query_list. They are needed by > GetSingleCachedPlan(). The changes outside the executor are pretty > minimal in this design and all the difficulties of having to loop back > to GetCachedPlan() are now gone. I like how this turned out. > > One idea that I think might be worth trying to reduce the footprint of > 0003 is to try to lock the prunable relations in a step of InitPlan() > separate from ExecInitNode(), which can be implemented by doing the > initial runtime pruning in that separate step. That way, we'll have > all the necessary locks before calling ExecInitNode() and so we don't > need to sprinkle the CachedPlanStillValid() checks all over the place > and worry about missed checks and dealing with partially initialized > PlanState trees. > > -- > Thanks, Amit Langote @@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams, if (customplan) { /* Build a custom plan */ - plan =3D BuildCachedPlan(plansource, qlist, boundParams, queryEnv); + plan =3D BuildCachedPlan(plansource, qlist, boundParams, queryEnv, true); Is the *true* here a typo? Seems it should be *false* for custom plan? --=20 Regards Junwang Zhao