Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQ9yP-001KL7-1S for pgsql-hackers@arkaria.postgresql.org; Thu, 21 May 2026 20:26:29 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wQ9xN-00AxR1-2Y for pgsql-hackers@arkaria.postgresql.org; Thu, 21 May 2026 20:25:26 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQ9xN-00AxQt-1f for pgsql-hackers@lists.postgresql.org; Thu, 21 May 2026 20:25:26 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wQ9xL-000000009MM-2gix for pgsql-hackers@lists.postgresql.org; Thu, 21 May 2026 20:25:25 +0000 Received: from pro.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.18.1/8.18.1) with ESMTP id 64LKPDZH2034406; Thu, 21 May 2026 16:25:13 -0400 From: Tom Lane To: Alexandra Wang cc: jian he , pgsql-hackers@lists.postgresql.org, Tomas Vondra , Andrei Lepikhov , Corey Huinker , hs@cybertec.at, Jeff Davis Subject: Re: Is there value in having optimizer stats for joins/foreignkeys? In-reply-to: References: <246035.1764627115@sss.pgh.pa.us> <8df3d212-5d60-4e30-9606-d8849f7d37ae@gmail.com> <3c477f2f-10e4-4705-bb21-90ccbe67e9d2@gmail.com> Comments: In-reply-to Alexandra Wang message dated "Fri, 15 May 2026 08:30:31 -0700" MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <24246.1779395113.1@sss.pgh.pa.us> Date: Thu, 21 May 2026 13:25:13 -0700 Message-ID: <24247.1779395113@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Alexandra Wang writes: > Here's v7, another attempt to fix the unstable tests. Hi Alexandra, I signed up for an in-person review of this at PGConf.dev, but the schedule doesn't seem to be working in favor of making that happen. If you see this and happen to run into me in the hallway, I'm happy to chat, but in any case here are my rather-hasty review notes. I think it's okay if v1 only handles 2-way joins, as long as the catalog representation is prepared for more. Restricting to cases where we can do index-based sampling seems fine too. Those things could be relaxed later if it seems worthwhile, but we'd have a creditable feature even without. I didn't read the sampling code in any detail. I think you will need to put more thought into what is user-friendly behavior in case the required index doesn't exist or doesn't have the right properties. (I think the tests for that might not be strong enough, either.) I think you could simplify some code noticeably if you included the anchor rel's OID as the first element of stxjoinrels[]. Yeah, it'd be redundant with stxrelid, but so what? It's not like pg_statistic_ext rows are narrow enough that anyone would notice the extra 4 bytes. I think this would simplify some of the relationships within the data structures, too, eg all varnos in the expressions could be considered to reference stxjoinrels[]. I don't love stxkeyrefs[]. I wonder if it's time to throw away stxkeys[], represent all the target columns as regular expression trees in stxexprs, and then special-case columns that are simple Vars where appropriate at execution. (In the same vein, I dislike the grammar's separation of plain columns from expressions; I'd like to replace stats_params with expr_list and sort it all out later. But perhaps that's material for a separate patch.) We will need to put more thought into permissions: I don't think requiring all the tables to have the same owner is workable. (What happens if someone tries to ALTER OWNER later?) However, if they don't all have the same owner, there are potential security problems, so the right restriction is not obvious. This is not necessary to solve now; there are bigger questions to worry about. But we'll need an answer before it's committable. It's not too soon to write some user-facing documentation. CREATE STATISTICS man page obviously needs attention, but also the discussion of extended stats in perform.sgml. And catalogs.sgml. I find that writing that sort of stuff helps to clarify where one's design is weak. regards, tom lane