Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pJLAD-0006It-Ip for pgsql-hackers@arkaria.postgresql.org; Sat, 21 Jan 2023 21:12:37 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pJLAB-0000hJ-4G for pgsql-hackers@arkaria.postgresql.org; Sat, 21 Jan 2023 21:12:35 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pJLAA-0000hA-QX for pgsql-hackers@lists.postgresql.org; Sat, 21 Jan 2023 21:12:34 +0000 Received: from mail-wm1-x335.google.com ([2a00:1450:4864:20::335]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1pJLA7-000774-S4 for pgsql-hackers@lists.postgresql.org; Sat, 21 Jan 2023 21:12:33 +0000 Received: by mail-wm1-x335.google.com with SMTP id bg13-20020a05600c3c8d00b003d9712b29d2so8111560wmb.2 for ; Sat, 21 Jan 2023 13:12:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=enterprisedb.com; s=google; h=content-transfer-encoding:in-reply-to:references:cc:to:from :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=LDhPLZav2gA9NbdoxMLc7OxC2gkqJ1+/dd+Z+vMBU8o=; b=a/gMucRQS5IQkg8SbhFdM5ux9G9Qaddt4TLQANrQecrRunmy4tuGVaFTvp7O2f7oMZ pWyXer8rxGi5WdJKZKoEUBCSLta7JLn/2b7eXQ6DRT4Frftm12G//wLMwXNS7Oiqpkxx 2y0KtgS17Q5JeLEKuzsuINEUQZf5KUaIm5ADl5mwLmNIk53oTW7XOu+g7UUy16bxLcbn XQfgsa9G38044Yc9y2bghTcJYQMg9SY4snJITn4w9qmBZ7sE1Bkdwn97YTZN9rrXq2H3 O4fOQt5+KVEBpCUZ0YoYxWGiUqPqwJAQ+O9A4J3G+bfdD+7Sw/IGhK6wsxvT44+mwgLs SX+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:references:cc:to:from :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=LDhPLZav2gA9NbdoxMLc7OxC2gkqJ1+/dd+Z+vMBU8o=; b=2XCq/+AVNGqVJh7FQhGfmQCpzUooAlb3xOsCCS1bliaCY6Gb3ciGG55uQ2AQ4AFAC5 CncBSemqV/emS8A4+c/bOnuq0hEra0JelUA9GEbe/Q+4ZD+FOZNKwCfuFQYpfiW4kCqG /SkQQMRI20EoJORpYPuVME1YHneXgyug0XZed2F4O/dyaiaZdl1Tg+On6BlPL9KtK4Lg 9kiUeJCcu/noZSAnMMFIOwwuOmilmtJlK8wBVLvvil9GOroEugf1jI8aZ2bph0DJkFY7 ONdvO2XJNeps71jfG135355E/vcEu5dTD94DdMbYX4eGumb1jnunB7LdUqKh+8adeKAs qizA== X-Gm-Message-State: AFqh2kq43EL9HfOjql41vV9Zvb3afUfLzrlUxa4q2f0VO7oL9u4U8ReL hPimu/tuF/6yfQhKXZrqzCovo4hzJlHz8+vO7gEZeVTIhN5cG2KqEwBH++lKTEcjOSzDKlPtbw4 tCg3NsqKRsvsMtOeyx9kAQ4Vn1y6uNFBuubg3k7l3RuReCwZz1KyUni0zQhta6Po9IlaO0a3zfU SXLrtWwnZi3AwwO8quVZyu2bd/A1+dAn+Vibt704Ny0qiOK6GSPqi5+3imw4zgYQZOuSChaYvn2 kAZxg7C84RXgh6n5WIKanzAvGZtNfFS2IFmAP+Lr6mYc3nnUdXLAd9aSbki X-Google-Smtp-Source: AMrXdXsbpUJNB9u2FHNrS29f2i8IE4qGooyELCQGn/YRst0kDXsxTcE9guBnyRBgFc3YrWoNdIz81A== X-Received: by 2002:a05:600c:3514:b0:3db:80c:8ed9 with SMTP id h20-20020a05600c351400b003db080c8ed9mr17155779wmq.22.1674335550049; Sat, 21 Jan 2023 13:12:30 -0800 (PST) Received: from [10.137.0.17] (ip-86-49-228-162.bb.vodafone.cz. [86.49.228.162]) by smtp.gmail.com with ESMTPSA id h19-20020a05600c351300b003cfd4e6400csm6523116wmq.19.2023.01.21.13.12.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 21 Jan 2023 13:12:29 -0800 (PST) Message-ID: Date: Sat, 21 Jan 2023 22:12:27 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Subject: Re: Implement missing join selectivity estimation for range types Content-Language: en-US From: Tomas Vondra To: Mahmoud Sakr Cc: PostgreSQL Hackers , SCHOEMANS Maxime , Diogo Repas , Luo Zhicheng , Andrey Lepikhov References: <8afecd87-d1e5-241c-5e3e-75e1c62c279b@enterprisedb.com> <1bc5e1e4-380b-213d-6e6a-d5e8f7f0c7f5@enterprisedb.com> <84ffb566-8038-ab35-c841-7a5e5728a247@enterprisedb.com> In-Reply-To: <84ffb566-8038-ab35-c841-7a5e5728a247@enterprisedb.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-CLOUD-SEC-AV-Info: enterprisedb,google_mail,monitor X-CLOUD-SEC-AV-Sent: true X-Gm-Spam: 0 X-Gm-Phishy: 0 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Mahmoud, I finally had time to properly read the paper today - the general approach mostly matches how I imagined the estimation would work for inequalities, but it's definitely nice to see the algorithm properly formalized and analyzed. What seems a bit strange to me is that the patch only deals with range types, leaving the scalar cases unchanged. I understand why (not having a MCV simplifies it a lot), but I'd bet joins on range types are waaaay less common than inequality joins on scalar types. I don't even remember seeing inequality join on a range column, TBH. That doesn't mean the patch is wrong, of course. But I'd expect users to be surprised we handle range types better than "old" scalar types (which range types build on, in some sense). Did you have any plans to work on improving estimates for the scalar case too? Or did you do the patch needed for the paper, and have no plans to continue working on this? I'm also wondering about not having MCV for ranges. I was a bit surprised we don't build MCV in compute_range_stats(), and perhaps we should start building those - if there are common ranges, this might significantly improve some of the estimates (just like for scalar columns). Which would mean the estimates for range types are just as complex as for scalars. Of course, we don't do that now. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company