Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1umxf0-000TUJ-6j for pgsql-hackers@arkaria.postgresql.org; Fri, 15 Aug 2025 16:52:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1umxez-000zS2-Eo for pgsql-hackers@arkaria.postgresql.org; Fri, 15 Aug 2025 16:52:10 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1umxez-000zRt-5X for pgsql-hackers@lists.postgresql.org; Fri, 15 Aug 2025 16:52:09 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1umxex-0001Fh-0s for pgsql-hackers@lists.postgresql.org; Fri, 15 Aug 2025 16:52:08 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 57FGq6ud616752; Fri, 15 Aug 2025 12:52:06 -0400 From: Tom Lane To: Chao Li cc: pgsql-hackers@lists.postgresql.org Subject: Re: Improve hash join's handling of tuples with null join keys In-reply-to: <544D7C83-CECE-44E7-B5D7-530E9318D231@gmail.com> References: <3061845.1746486714@sss.pgh.pa.us> <496221.1748882849@sss.pgh.pa.us> <175507656113.993.1381684440543440253.pgcf@coridan.postgresql.org> <544D7C83-CECE-44E7-B5D7-530E9318D231@gmail.com> Comments: In-reply-to Chao Li message dated "Thu, 14 Aug 2025 10:36:06 +0800" MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-ID: <616750.1755276726.1@sss.pgh.pa.us> Content-Transfer-Encoding: quoted-printable Date: Fri, 15 Aug 2025 12:52:06 -0400 Message-ID: <616751.1755276726@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Chao Li writes: > With this patch, =E2=80=9Cisnull=E2=80=9D now becomes true because of th= e change of strict op. Then the outer null join key tuple must be stored i= n a tuplestore. When an outer table contains a lot of null join key tuples= , then the tuplestore could bump to very large, in that case, it would be = hard to say this patch really benefits. What's your point? If we don't divert those tuples into the tuplestore, then they will end up in the main hash table instead, and the consequences of bloat there are far worse. > Based on this patch, if we are doing a left join, and outer table is emp= ty, then all tuples from the inner table should be returned. In that case,= we can skip building a hash table, instead, we can put all inner table tu= ples into hashtable.innerNullTupleStore. Building a tuplestore should be c= heaper than building a hash table, so this way makes a little bit more per= formance improvement. I think that would make the logic completely unintelligible. Also, a totally-empty input relation is not a common situation. We try to optimize such cases when it's simple to do so, but we shouldn't let that drive the fundamental design. regards, tom lane