Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tJ5kI-000rmF-GN for pgsql-hackers@arkaria.postgresql.org; Thu, 05 Dec 2024 06:53:54 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tJ5kF-004obQ-Tr for pgsql-hackers@arkaria.postgresql.org; Thu, 05 Dec 2024 06:53:53 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tJ5kF-004obE-G4 for pgsql-hackers@lists.postgresql.org; Thu, 05 Dec 2024 06:53:52 +0000 Received: from mail-pg1-x536.google.com ([2607:f8b0:4864:20::536]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tJ5kD-0018RC-Nh for pgsql-hackers@postgresql.org; Thu, 05 Dec 2024 06:53:51 +0000 Received: by mail-pg1-x536.google.com with SMTP id 41be03b00d2f7-7fd17f230c2so262294a12.0 for ; Wed, 04 Dec 2024 22:53:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1733381629; x=1733986429; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=ujn/3qNgCGV/tPJIVL7kufiKEa3cYN+mI7qYIi+fd2I=; b=gkMxZkJnFlgHT9cFaOlCri/Sh1in0VWIkSYwd9X+sDMpR0/InqVNmGx2tGRY/bZ7TP giw418AJaY4Q1GJAKO2TVJjkAzjB4jaDrXb3K5uTGcpwnf1RJOCvRVwC4uoUNF+Y2c6x jgnEB3gqIl5erG15t232tswFFlhdDymXt/Mm9+lIcHTXhrdoWdAYKtMVNglyhdifzY+Z eVZwWFgP/HXY1F+aTVkM3hPvIwueD30zwHHkfdP3mj80PVom4HP9arPAAt1a9DJrKnbf vHZQk4/1GZaakbuCdxISsAuisLrNlw1yYE8IyHxSSa44XL+BYGQR9rutRh2VhyIKKZve ysKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733381629; x=1733986429; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ujn/3qNgCGV/tPJIVL7kufiKEa3cYN+mI7qYIi+fd2I=; b=MJbVV8+4NL+L59Mov0dHSGBN79KlfWpmYDidITBYLzhAaGCut50LYPG/vDQ6BVRpVX kvrwBF3qTLY94FEY2kMkHS0Xfyq4xotj8x+SCwmHgZb2ACTbGmt3NRrXbE6meU30emL2 pLnfTAGOVAeRo8YIskRYwfmL8TMg3S0+hwT1jHSj/iS3+IwoC073JAJ4ypkz+MExiCaq RUAvYpNWpMa59/4M4YbJU2HFNE17Drco6raxjaOI4ojdMfwvo3DS3rA2Pjj5oH16AbmT iDWvtiAGo8+Llxo3fGsdGm0lCBaDdntnwJQPXEkM5kAbJamytkrzAAl5hAv2UfQI2+py uqvg== X-Forwarded-Encrypted: i=1; AJvYcCUi2cGW6512mrKB838kIa3ZTb349VhjyqmD52ZunZ7hzN6aAih/DTb/AcMhDZlHXMdvRWizwXdIi93M6BKT@postgresql.org X-Gm-Message-State: AOJu0YwTseksBsf0kcZk6UkhFOc9Fb+caFAs8WRPxdmHFw2s+OrTqvAF hjSCfc3bdiam09KgqkOurCLNP3Wi70dAimztBfKc+pNLgE35iwgRVz+jTjZxqj1ZB21l6lbF9t2 /mu7/FN7rPewMo1o43KIsmehJWCw= X-Gm-Gg: ASbGncs6uuYvPfej0ny4kQjMoJBdmq9raWknY2CXXPu74ezSi1sqV0sDCruFz3r6W6O YE5JuzvSbSrjKvUrYpOJx7inytTeyf9Sr X-Google-Smtp-Source: AGHT+IGJLR6ortOSAucvO6GyYZRYNq/shhPGgK9ATZ0nonlg3q1bbGbXF/PyHYWSA6gi6ydCT1WOSpbXwmUTMqpT7tQ= X-Received: by 2002:a17:90b:388e:b0:2ee:db1a:2e3c with SMTP id 98e67ed59e1d1-2ef011e384bmr11766901a91.1.1733381628621; Wed, 04 Dec 2024 22:53:48 -0800 (PST) MIME-Version: 1.0 References: <54c35fb9-da3a-4754-ab8c-46ed0b612465@vondra.me> <684c70d7-180e-461d-9377-600c2db581ba@vondra.me> In-Reply-To: <684c70d7-180e-461d-9377-600c2db581ba@vondra.me> From: Amit Langote Date: Thu, 5 Dec 2024 15:53:32 +0900 Message-ID: Subject: Re: generic plans and "initial" pruning To: Tomas Vondra Cc: Robert Haas , Alvaro Herrera , Andres Freund , Daniel Gustafsson , David Rowley , PostgreSQL Hackers , Thom Brown , Tom Lane Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Dec 5, 2024 at 2:20=E2=80=AFAM Tomas Vondra wrote= : > On 12/4/24 14:34, Amit Langote wrote: > > On Mon, Dec 2, 2024 at 3:36=E2=80=AFAM Tomas Vondra w= rote: > >> 0001 > >> ---- > >> > >> 1) But if we don't expect this error to actually happen, do we really > >> need to make it ereport()? Maybe it should be plain elog(). I mean, it= 's > >> "can't happen" and thus doesn't need translations etc. > >> > >> if (!bms_equal(relids, pruneinfo->relids)) > >> ereport(ERROR, > >> errcode(ERRCODE_INTERNAL_ERROR), > >> errmsg_internal("mismatching PartitionPruneInfo found = at > >> part_prune_index %d", > >> part_prune_index), > >> errdetail_internal("plan node relids %s, pruneinfo > >> relids %s", > >> bmsToString(relids), > >> bmsToString(pruneinfo->relids))); > > > > I'm fine with elog() here even if it causes the message to be longer: > > > > elog(ERROR, "mismatching PartitionPruneInfo found at part_prune_index > > %d (plan node relids %s, pruneinfo relids %s) > > > > I'm not forcing you to do elog, if you think ereport() is better. I'm > only asking because AFAIK the "policy" is that ereport is for cases that > think can happen (and thus get translated), while elog(ERROR) is for > cases that we believe shouldn't happen. > > So every time I see "ereport" I ask myself "how could this happen" which > doesn't seem to be the case here. > > >> Perhaps it should even be an assert? > > > > I am not sure about that. Having a message handy might be good if a > > user ends up hitting this case for whatever reason, like trying to run > > a corrupted plan. > > I'm a bit skeptical about this, TBH. If we assume the plan is > "corrupted", why should we notice in this particular place? I mean, it > could be corrupted in a million different ways, and the chance that it > got through all the earlier steps is like 1 in a 1.000.000. Yeah, I am starting to think the same. Btw, the idea to have a check and elog() / ereport() came from Alvaro upthread: https://www.postgresql.org/message-id/20221130181201.mfinyvtob3j5i2a6%40alv= herre.pgsql > >> 2) I'm not quite sure what "exec" partition pruning is? > >> > >> /* > >> * ExecInitPartitionPruning > >> * Initialize the data structures needed for runtime "exec" partitio= n > >> * pruning and return the result of initial pruning, if available. > >> > >> Is that the same thing as "runtime pruning"? > > > > "Exec" pruning refers to pruning performed during execution, using > > PARAM_EXEC parameters. In contrast, "init" pruning occurs during plan > > initialization, using parameters whose values remain constant during > > execution, such as PARAM_EXTERN parameters and stable functions. > > > > Before this patch, the ExecInitPartitionPruning function, called > > during ExecutorStart(), performed "init" pruning and set up state in > > the PartitionPruneState for subsequent "exec" pruning during > > ExecutorRun(). With this patch, "init" pruning is performed well > > before this function is called, leaving its sole responsibility to > > setting up the state for "exec" pruning. It may be worth renaming the > > function to better reflect this new role, rather than updating only > > the comment. > > > > Actually, that is what I decided to do in the attached, along with > > some other adjustments like moving ExecDoInitialPruning() to > > execPartition.c from execMain.c, fixing up some obsolete comments, > > etc. > > > > I don't see any attachment :-( > > Anyway, if I understand correctly, the "runtime pruning" has two > separate cases - initial pruning and exec pruning. Is that right? That's correct. These patches are about performing "initial" pruning at a different time and place so that we can take the deferred locks on the unpruned partitions before we perform ExecInitNode() on any of the plan trees in the PlannedStmt. > >> 0005 > >> ---- > >> > >> 1) auto_explain.c - So what happens if the plan gets invalidated? The > >> hook explain_ExecutorStart returns early, but then what? Does that bre= ak > >> the user session somehow, or what? > > > > It will get called again after ExecutorStartExt() loops back to do > > ExecutorStart() with a new updated plan tree. > > > >> 2) Isn't it a bit fragile if this requires every extension to update > >> and add the ExecPlanStillValid() calls to various places? > > > > The ExecPlanStillValid() call only needs to be added immediately after > > the call to standard_ExecutorStart() in an extension's > > ExecutorStart_hook() implementation. > > > >> What if an > >> extension doesn't do that? What weirdness will happen? > > > > The QueryDesc.planstate won't contain a PlanState tree for starters > > and other state information that InitPlan() populates in EState based > > on the PlannedStmt. > > OK, and the consequence is that the query will fail, right? No, the core executor will retry the execution with a new updated plan. In the absence of the early return, the extension might even crash when accessing such incomplete QueryDesc. What the patch makes the ExecutorStart_hook do is similar to how InitPlan() will return early when locks taken on partitions that survive initial pruning invalidate the plan. > >> Maybe it'd be > >> possible to at least check this in some other executor hook? Or at lea= st > >> we could ensure the check was done in assert-enabled builds? Or > >> something to make extension authors aware of this? > > > > I've added a note in the commit message, but if that's not enough, one > > idea might be to change the return type of ExecutorStart_hook so that > > the extensions that implement it are forced to be adjusted. Say, from > > void to bool to indicate whether standard_ExecutorStart() succeeded > > and thus created a "valid" plan. I had that in the previous versions > > of the patch. Thoughts? > > Maybe. My concern is that this case (plan getting invalidated) is fairly > rare, so it's entirely plausible the extension will seem to work just > fine without the code update for a long time. You might see the errors like the one below when the core executor or a hook tries to initialize or process in some other way a known invalid plan, for example, because an unpruned partition's index got concurrently dropped before the executor got the lock: ERROR: could not open relation with OID xxx > Sure, changing the APIs is allowed, I'm just wondering if maybe there > might be a way to not have this issue, or at least notice the missing > call early. > > I haven't tried, wouldn't it be better to modify ExecutorStart() to do > the retries internally? I mean, the extensions wouldn't need to check if > the plan is still valid, ExecutorStart() would take care of that. Yeah, > it might need some new arguments, but that's more obvious. One approach could be to move some code from standard_ExecutorStart() into ExecutorStart(). Specifically, the code responsible for setting up enough state in the EState to perform ExecDoInitialPruning(), which takes locks that might invalidate the plan. If the plan does become invalid, the hook and standard_ExecutorStart() are not called. Instead, the caller, ExecutorStartExt() in this case, creates a new plan. This avoids the need to add ExecPlanStillValid() checks anywhere, whether in core or extension code. However, it does mean accessing the PlannedStmt earlier than InitPlan(), but the current placement of the code is not exactly set in stone. --=20 Thanks, Amit Langote