Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qT33d-008ELu-V4 for pgsql-hackers@arkaria.postgresql.org; Mon, 07 Aug 2023 16:26:14 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qT33a-00Aa1d-9k for pgsql-hackers@arkaria.postgresql.org; Mon, 07 Aug 2023 16:26:10 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qT33Z-00Aa1V-Su for pgsql-hackers@lists.postgresql.org; Mon, 07 Aug 2023 16:26:10 +0000 Received: from mail-lf1-x130.google.com ([2a00:1450:4864:20::130]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qT33W-0013Cx-PV for pgsql-hackers@postgresql.org; Mon, 07 Aug 2023 16:26:09 +0000 Received: by mail-lf1-x130.google.com with SMTP id 2adb3069b0e04-4fe0d5f719dso7764936e87.2 for ; Mon, 07 Aug 2023 09:26:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1691425564; x=1692030364; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=wHOsphxRvg6B3P6PbB46OLOzX/9dphjpc59Op+RwJGU=; b=lINWugQY7frnhLpzQehNSJI/PeUZC60nNrBvIRhBBYinScpysoNZr9rFk+z4dKCqeJ /hH+kTCHh+RlzW85n5QWlz0xpHdtnVCqPmIp/B0IF1YdEMPGjYWbloK38irWp7JjToj5 C0kQe4OEw3TEF1e2ig1XQFKO2lZfjYsQ+aFi/ZlmG04wMNvTE3qyK/DgTAUDU29VcI85 v7KJu1CRLSFmPdcH6WYBcl1BkwQiAjuctPdvDO7mOR+j2ubQCsD4vWC0QlOG1uleLdg+ AAWE8Yd+xTkss5kpGC8AsPPabNJq66KmXi8VX0LfNKAAZGBGAZM69QazhNQDUVK2BLml kLoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691425564; x=1692030364; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wHOsphxRvg6B3P6PbB46OLOzX/9dphjpc59Op+RwJGU=; b=O2fj5B8XEpiI0hAp05kxX/+nhupwDC8JjHOp2yy3GQrBf2HVHLwiPaViamFzmOh4lM sSy4np+/lQ3ftQyuByZ7E4vFDxEwtpQYFF1bMqzGrbQ7zub2YutWhxP5zNFnLRbaHI7Z /hNRmryk4ZCKG/3FRCVXeZwVYVS2eNCJLgWAofhI+/bZP7ob2iHZkYzFX9Qt7sU8rUWs ONuKwlK5k00VPBcaupmrVCsO+JCIoqPNtkB/fsUsJd2drcrsPBrO0xD8vxn2oUuNEtRZ GLHJWZh4kyVrolCG1sWhETKF8xqhnP9qUu855ZQH7spGzhFHTkCdut7mHNkM05Pv188d eFJA== X-Gm-Message-State: AOJu0Yys6eCB6l1JXOp/W+vxA2szjAHqD8hU0MJbdixqdunPRq5dZ6HR 9uvFESTjsF2gW3LpDRSUWttfZbuTMnD039EsF6E= X-Google-Smtp-Source: AGHT+IGQDCi7xqirSAQ/jVa3cgEWwk/rTuOm3cxpnbKEnjPLf9YhBcCwe3qn1q945X8kDeUGNq0nwkLI2b0buZPGozc= X-Received: by 2002:a05:6512:3451:b0:4f9:54f0:b6db with SMTP id j17-20020a056512345100b004f954f0b6dbmr5535496lfr.13.1691425564266; Mon, 07 Aug 2023 09:26:04 -0700 (PDT) MIME-Version: 1.0 References: <20221221101846.7zsi7lscnm5ediik@alvherre.pgsql> <1350682.1671635927@sss.pgh.pa.us> <4191508.1674157166@sss.pgh.pa.us> <349124.1674185509@sss.pgh.pa.us> <20230207180855.xy5m4puwh5gzd7xy@awork3.anarazel.de> <9D23D8BB-83EE-451E-95C1-CF42ADB76869@yesql.se> <813399.1691423093@sss.pgh.pa.us> In-Reply-To: <813399.1691423093@sss.pgh.pa.us> From: Robert Haas Date: Mon, 7 Aug 2023 12:25:52 -0400 Message-ID: Subject: Re: generic plans and "initial" pruning To: Tom Lane Cc: Amit Langote , Thom Brown , Daniel Gustafsson , Alvaro Herrera , Andres Freund , David Rowley , Jacob Champion , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, Aug 7, 2023 at 11:44=E2=80=AFAM Tom Lane wrote: > Right, I doubt that changing that is going to work out well. > Hash joins might have issues with it too. I thought about the case, because Hash and Hash Join are such closely intertwined nodes, but I don't see any problem there. It doesn't really look like it would matter in what order things got cleaned up. Unless I'm missing something, all of the data structures are just independent things that we have to get rid of sometime. > Could it work to make the patch force child cleanup before parent, > instead of after? Or would that break other places? To me, it seems like the overwhelming majority of the code simply doesn't care. You could pick an order out of a hat and it would be 100% OK. But I haven't gone and looked through it with this specific idea in mind. > On the whole though I think it's probably a good idea to leave > parent nodes in control of the timing, so I kind of side with > your later comment about whether we want to change this at all. My overall feeling here is that what Gather and Gather Merge is doing is pretty weird. I think I kind of knew that at the time this was all getting implemented and reviewed, but I wasn't keen to introduce more infrastructure changes than necessary given that parallel query, as a project, was still pretty new and I didn't want to give other hackers more reasons to be unhappy with what was already a lot of very wide-ranging change to the system. A good number of years having gone by now, and other people having worked on that code some more, I'm not too worried about someone calling for a wholesale revert of parallel query. However, there's a second problem here as well, which is that I'm still not sure what the right thing to do is. We've fiddled around with the shutdown sequence for parallel query a number of times now, and I think there's still stuff that doesn't work quite right, especially around getting all of the instrumentation data back to the leader. I haven't spent enough time on this recently enough to be sure what if any problems remain, though. So on the one hand, I don't really like the fact that we have an ad-hoc recursion arrangement here, instead of using planstate_tree_walker or, as Amit proposes, a List. Giving subordinate nodes control over the ordering when they don't really need it just means we have more code with more possibility for bugs and less certainty about whether the theoretical flexibility is doing anything in practice. But on the other hand, because we know that at least for the Gather/GatherMerge case it seems like it probably matters somewhat, it definitely seems appealing not to change anything as part of this patch set that we don't really have to. I've had it firmly in my mind here that we were going to need to change something somehow -- I mean, the possibility of returning in the middle of node initialization seems like a pretty major change to the way this stuff works, and it seems hard for me to believe that we can just do that and not have to adjust any code anywhere else. Can it really be true that we can do that and yet not end up creating any states anywhere with which the current cleanup code is unprepared to cope? Maybe, but it would seem like rather good luck if that's how it shakes out. Still, at the moment, I'm having a hard time understanding what this particular change buys us. --=20 Robert Haas EDB: http://www.enterprisedb.com