Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sdUZL-00G3Oq-Cm for pgsql-hackers@arkaria.postgresql.org; Mon, 12 Aug 2024 12:54:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sdUZJ-00FoxM-9S for pgsql-hackers@arkaria.postgresql.org; Mon, 12 Aug 2024 12:54:37 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sdUZI-00Fowy-VS for pgsql-hackers@lists.postgresql.org; Mon, 12 Aug 2024 12:54:36 +0000 Received: from mail-pf1-x432.google.com ([2607:f8b0:4864:20::432]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sdUZG-004Lme-9a for pgsql-hackers@postgresql.org; Mon, 12 Aug 2024 12:54:35 +0000 Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-7104f93a20eso3577444b3a.1 for ; Mon, 12 Aug 2024 05:54:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723467273; x=1724072073; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=fV7w3dFnMW3OHD1eEqpNTCs7axvzaLngtTaEVKx3yhI=; b=cNdoNreBR+RE6AOD5Omu7uc4BIr1oQk9twf1xVqdyjjftX7AQKIc19eLmkMRDNlOcH 1LU5w1aDEaaFOWAxCAlb61qpYbpeg5Yx942U7baDFn4v11WszRh8g4dt1dOiYwTrpeLF k0OsWvxV8bOuGobD7mfJVviiNriJqTHzFk1RwkLwGe5whXQ0n/yIb4vUuu4m50BfNBZ5 bBnUPpTUw/WYdKxalcHvkJArpqgqER0UjXDNKpQJ5FlJKhERLkRcia7PYCieMyfV9qv5 CVVdqq1IwmrVYlUlfbveHHibAkqgaZuh0/bfde5rgTfqtV/F3LiGNQbr8swpAS7EX4GF 5auw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723467273; x=1724072073; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fV7w3dFnMW3OHD1eEqpNTCs7axvzaLngtTaEVKx3yhI=; b=xU8E85dNbWIYo2/bsPvLtWuuZ/DaubbB471QdvR7aRrqcUIe+chqF7rGuJQgkk84dd 2rk7qNjeZvW8V0B/A2iSFre8erFnzuslPTmWllZmE/68TWQCT7rkGXR4h+cVSgyV+3cL GMG1sMC26kw0ejBs70kZHRrv4ldaVL6KLFy3y+IV1hEwUBkWQUKuOw1HULUKuGlMHzbc Yfj6TR4cTnBsWWyjJOZMBtQv5DIwdS78k3lXL8DT9Ls8LO08TgyemRWtE/uUy/20rWpE KinODAk3iqR6dQ1GT7hw7/d0o5JC3kjuRqsvuCakHUr5nM9so/HSHnXp0ezjdOcFM8Nt 7cKQ== X-Forwarded-Encrypted: i=1; AJvYcCW7y3tfMf2+2AxYHkWzjMlH2t/9+w3wW0WJpPljzA8+Phm0WRXOtC6UtGIxSAp9Oetr2DakgFxoppQJU3hohXM6cHIfCR1Dk2kCWHsy X-Gm-Message-State: AOJu0YwT53r5F3J3mDnH38yMsAk1aTZr9pTop54+nKurO/5/NCzmzouo UvBI/8bS0IFafWkBqEU8jto0E2Xdqnje5coK2Z+i63AMFmtfp+UxUe+PCAnxXt69JR9UpzYMRGo X6poAw5ikZKVuXsCA1+Z+Gqc9r4k= X-Google-Smtp-Source: AGHT+IEtlGIQCVRguHcF8MX2MoNK5Ix+bOSjs69AuP4Wa6YEHFw2LLKQa7g3lHa/p3HLbLk8vOu5Amp7bTVPMq8R6gs= X-Received: by 2002:a05:6a21:3a94:b0:1c0:f315:ec7e with SMTP id adf61e73a8af0-1c8d74e3ebcmr371319637.28.1723467272975; Mon, 12 Aug 2024 05:54:32 -0700 (PDT) MIME-Version: 1.0 References: <202406191709.jbvpf7d7hl6g@alvherre.pgsql> In-Reply-To: <202406191709.jbvpf7d7hl6g@alvherre.pgsql> From: Amit Langote Date: Mon, 12 Aug 2024 21:54:16 +0900 Message-ID: Subject: Re: generic plans and "initial" pruning To: Alvaro Herrera Cc: Robert Haas , Andres Freund , Daniel Gustafsson , David Rowley , Jacob Champion , PostgreSQL Hackers , Thom Brown , Tom Lane Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Jun 20, 2024 at 2:09=E2=80=AFAM Alvaro Herrera wrote: > I hope we can get this new executor code in 18. Thanks for doing the benchmark, Alvaro, and sorry for the late reply. Yes, I'm hoping to get *some* version of this into v18. I've been thinking how to move this forward and I'm starting to think that we should go back to or at least consider as an option the old approach of changing the plancache to do the initial runtime pruning instead of changing the executor to take locks, which is the design that the latest patch set tries to implement. Here are the challenges facing the implementation of the current design: 1. I went through many iterations of the changes to ExecInitNode() to return a partially initialized PlanState tree when it detects that the CachedPlan was invalidated after locking a child table and to ExecEndNode() to account for the PlanState tree sometimes being partially initialized, but it still seems fragile and bug-prone to me. It might be because this approach is fundamentally hard to get right or I haven't invested enough effort in becoming more confident in its robustness. 2. Refactoring needed due to the ExecutorStart() API change especially that pertaining to portals does not seem airtight. I'm especially worried about moving the ExecutorStart() call for the PORTAL_MULTI_QUERY case from where it is currently to PortalStart(). That requires additional bookkeeping in PortalData and I am not totally sure that the snapshot handling changes after that move are entirely correct. 3. The need to add *back* the fields to store the RT indexes of relations that are not looked at by ExecInitNode() traversal such as root partitioned tables and non-leaf partitions. I'm worried about #2 the most. One complaint about the previous design was that the interface changes to capture and pass the result of doing initial pruning in plancache.c to the executor did not look great. However, after having tried doing #2, the changes to pass the pruning result into the executor and changes to reuse it in ExecInit[Merge]Append() seem a tad bit simpler than the refactoring and adjustments needed to handle failed ExecutorStart() calls, at multiple code sites. About #1, I tend to agree with David that adding complexity around PlanState tree construction may not be a good idea, because we might want to rethink Plan initialization code and data structures in the not too distant future. One idea I thought of is to take the remaining locks (to wit, those on inheritance children if running a cached plan) at the beginning of InitPlan(), that is before ExecInitNode(), like we handle the permission checking, so that we don't need to worry about ever returning a partially initialized PlanState tree. However, we're still left with the tall task to implement #2 such that it doesn't break anything. Another concern about the old design was the unnecessary overhead of initializing bitmapset fields in PlannedStmt that are meant for the locking algorithm in AcquireExecutorLocks(). Andres suggested an idea offlist to either piggyback on cursorOptions argument of pg_plan_queries() or adding a new boolean parameter to let the planner know if the plan is one that might get cached and thus have AcquireExecutorLocks() called on it. Another idea David and I discussed offlist is inventing a RTELockInfo (cf RTEPermissionInfo) and only creating one for each RT entry that is un-prunable and do away with PlannedStmt.rtable. For partitioned tables, that entry will point to the PartitionPruneInfo that will contain the RT indexes of partitions (or maybe just OIDs) mapped from their subplan indexes that are returned by the pruning code. So AcquireExecutorLocks() will lock all un-prunable relations by referring to their RTELockInfo entries and for each entry that points to a PartitionPruneInfo with initial pruning steps, will only lock the partitions that survive the pruning. I am planning to polish that old patch set and post after playing with those new ideas. -- Thanks, Amit Langote