From: Tom Lane
To: Robert Haas
cc: Alexander Lakhin, Lukas Fittl, PostgreSQL Hackers
Subject: Re: pg_plan_advice
Comments: In-reply-to Robert Haas message dated "Fri, 03 Apr 2026 20:14:49 -0400"
Date: Fri, 03 Apr 2026 23:14:46 -0400
Message-ID: <3877210.1775272486@sss.pgh.pa.us>

Robert Haas writes:
> ...
> But I also feel like if we've only seen one buildfarm
> failure since the last round of stabilization, it might not be a
> catastrophe if nothing further is done before feature freeze. In fact,
> I think it might be *good*. Given the apparently-low failure rate that
> we now have, it feels to me like we might want to run like this for a
> month or even two or three to get a clearer feeling for whether the
> failure you saw is the only one or whether, perhaps, there are others.
> Or even just how often this one happens.

Reasonable point.

> I mean, there is possibly an argument that we don't really need to
> gather any more information about the problem; it does seem like we
> understand what is going on here, and if we had a great, simple fix I
> would probably just apply it and be done with it. But I also don't
> quite understand why you're in such a rush. If we still feel like
> running the tests serially is the best solution in a month, can't we
> just do it then?

The terms that I'm thinking in are "how much redesign will we accept
post-feature-freeze, in either pg_plan_advice or test_plan_advice,
before choosing to revert those modules entirely for v19?". I think
that running those tests serially is a sufficiently low-risk option
that it'd be okay to put it in post-freeze, even very long after.
I'm not sure that any of the other group-1 or group-2 options you
suggested would be okay post-freeze. (Of course, ultimately that'd
be the RMT's decision, not mine.)

I believe that we probably will need to do something in this area
before v19 release. If we're willing to commit to it being "run
the tests serially", then sure, we can wait awhile before actually
doing that. Maybe we'll even think of a better idea ... but what
we can do about this post-freeze seems pretty constrained to me.

			regards, tom lane