Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vuzjl-00G8GI-3B for pgsql-hackers@arkaria.postgresql.org; Tue, 24 Feb 2026 21:14:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vuzjk-003VpB-2n for pgsql-hackers@arkaria.postgresql.org; Tue, 24 Feb 2026 21:14:32 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vuzjk-003Vp3-18 for pgsql-hackers@lists.postgresql.org; Tue, 24 Feb 2026 21:14:32 +0000 Received: from forward502d.mail.yandex.net ([178.154.239.210]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1vuzjg-000000017Ax-32sh for pgsql-hackers@lists.postgresql.org; Tue, 24 Feb 2026 21:14:32 +0000 Received: from mail-nwsmtp-smtp-production-main-85.iva.yp-c.yandex.net (mail-nwsmtp-smtp-production-main-85.iva.yp-c.yandex.net [IPv6:2a02:6b8:c0c:ba27:0:640:538:0]) by forward502d.mail.yandex.net (Yandex) with ESMTPS id 2B4BEC1C75; Wed, 25 Feb 2026 00:14:28 +0300 (MSK) Received: by mail-nwsmtp-smtp-production-main-85.iva.yp-c.yandex.net (smtp/Yandex) with ESMTPSA id OEoI22eGuW20-XzPw5oUO; Wed, 25 Feb 2026 00:14:27 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tantorlabs.com; s=mail; t=1771967667; bh=SdZi+5IBGvCp7ws2FCvsxAY+oglI7JIn2VJVR8UKdE0=; h=In-Reply-To:Cc:Date:References:To:Subject:Message-ID:From; b=Z7sBoeLYGbeffGHPSC+wOXqlUdRwLn5DELv5CeVZ3RPUsudNTwNVRPVCKabvF2AH0 0QMIMG9XKRH3XwaXRauYZC45cCdvSsw6WXtw2MiwSEi6FyT7ZnwbWbvTZM5DxJOef/ UCcJwtfAOYX1Rt4UsZjuMrXuDDg+5NtdGl1jCeLs= Authentication-Results: mail-nwsmtp-smtp-production-main-85.iva.yp-c.yandex.net; dkim=pass header.i=@tantorlabs.com Content-Type: multipart/mixed; boundary="------------Ju0qgutIjFRUAPUUx0EksKxD" Message-ID: <4c761b02-5a60-4076-aa0c-9c6fef06e2c1@tantorlabs.com> Date: Wed, 25 Feb 2026 00:14:01 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Reduce planning time for large NOT IN lists containing NULL To: David Geier , Zsolt Parragi Cc: PostgreSQL Developers References: <52596e08-0902-475b-ad0f-265ec6918f19@gmail.com> <7791ec74-369d-48c7-969c-cf4ad18024f2@tantorlabs.com> <6b3aebb5-f26c-4f50-abd6-e733d452af26@gmail.com> <52f3c637-465d-4f0e-9546-732ecf61ab87@tantorlabs.com> <3fc74079-f429-48c8-ad8c-688b21d4d1c1@gmail.com> Content-Language: en-US From: Ilia Evdokimov In-Reply-To: <3fc74079-f429-48c8-ad8c-688b21d4d1c1@gmail.com> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk This is a multi-part message in MIME format. --------------Ju0qgutIjFRUAPUUx0EksKxD Content-Type: multipart/alternative; boundary="------------fRW77JvHXRdboVC3GvTwQP6E" --------------fRW77JvHXRdboVC3GvTwQP6E Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 2/24/26 11:29, David Geier wrote: > Using array_contains_nulls() seems fine. In case the IN list doesn't > contain NULL, the function can immediately bail thanks to the > !ARR_HASNULL() check in the beginning. > > It only needs to iterate over the NULL-bitmap, if it exists. This is the > case if there's actually a NULL element in the array, or if the array > initially contained NULL and all NULLs got removed subsequently. > > If we ever find the latter case to matter we could remove the > NULL-bitmap in array_set_element() / array_set_element_expanded(), when > the last NULL element got removed. > >> Could you clarify what exactly this additional test meant to verify? > Zsolt's test case creates an array that initially contains NULL. The > NULL element is subsequently replaced by a non-NULL value but > array_set_element_expanded() keeps the NULL-bitmap around. With that, > your ARR_ISNULL() check bails and causes the selectivity estimation to > incorrectly return 0. Ah, right - thanks for the clarification. I agree. Regarding the regression test: the suggestion test case is good, but there is not a straightforward way to expose the estimated row count without also showing the costs, and costs are unstable. To avoid that, I reused the parsing approach already present in stats_ext.sql to extract only the estimated row count from EXPLAIN. Since the test table contains exactly 1000 rows and we run ANALYZE, all rows are included in the statistics sample. Therefore the estimate for x <> ALL(array[1, 99, 2]) is deterministically 997 rows, and the test stable and ensures we detect the incorrect early-zero estimate. Let me know if you'd prefer a different approach. I've attached v4 patch. -- Best regards, Ilia Evdokimov, Tantor Labs LLC, https://tantorlabs.com/ --------------fRW77JvHXRdboVC3GvTwQP6E Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit

On 2/24/26 11:29, David Geier wrote:

Using array_contains_nulls() seems fine. In case the IN list doesn't
contain NULL, the function can immediately bail thanks to the
!ARR_HASNULL() check in the beginning.

It only needs to iterate over the NULL-bitmap, if it exists. This is the
case if there's actually a NULL element in the array, or if the array
initially contained NULL and all NULLs got removed subsequently.

If we ever find the latter case to matter we could remove the
NULL-bitmap in array_set_element() / array_set_element_expanded(), when
the last NULL element got removed.

Could you clarify what exactly this additional test meant to verify?
Zsolt's test case creates an array that initially contains NULL. The
NULL element is subsequently replaced by a non-NULL value but
array_set_element_expanded() keeps the NULL-bitmap around. With that,
your ARR_ISNULL() check bails and causes the selectivity estimation to
incorrectly return 0.

Ah, right - thanks for the clarification. I agree.

Regarding the regression test: the suggestion test case is good, but there is not a straightforward way to expose the estimated row count without also showing the costs, and costs are unstable. To avoid that, I reused the parsing approach already present in stats_ext.sql to extract only the estimated row count from EXPLAIN.

Since the test table contains exactly 1000 rows and we run ANALYZE, all rows are included in the statistics sample. Therefore the estimate for x <> ALL(array[1, 99, 2]) is deterministically 997 rows, and the test stable and ensures we detect the incorrect early-zero estimate.

Let me know if you'd prefer a different approach. I've attached v4 patch.

-- 
Best regards, 
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/


--------------fRW77JvHXRdboVC3GvTwQP6E-- --------------Ju0qgutIjFRUAPUUx0EksKxD Content-Type: text/x-patch; charset=UTF-8; name="v4-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch" Content-Disposition: attachment; filename*0="v4-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.pa"; filename*1="tch" Content-Transfer-Encoding: base64 RnJvbSBhOWRkMWQxYzk5OTBhMzAxNzNjNTIxMDU3MGVmY2NiMDM0ZThiNWY0IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBFdmRva2ltb3YgSWxpYSA8aWx5YS5ldmRva2ltb3ZA dGFudG9ybGFicy5jb20+CkRhdGU6IFdlZCwgMjUgRmViIDIwMjYgMDA6MDg6MjYgKzAzMDAK U3ViamVjdDogW1BBVENIIHY0XSBSZWR1Y2UgcGxhbm5pbmcgdGltZSBmb3IgbGFyZ2UgTk9U IElOIGxpc3RzIGNvbnRhaW5pbmcgCiBOVUxMCgpGb3IgeCA8PiBBTEwgKC4uLikgLyB4IE5P VCBJTiAoLi4uKSwgdGhlIHByZXNlbmNlIG9mIGEgTlVMTCBlbGVtZW50Cm1ha2VzIHRoZSBz ZWxlY3Rpdml0eSAwLjAuCgpUaGUgcGxhbm5lciBjdXJyZW50bHkgc3RpbGwgaXRlcmF0ZXMg b3ZlciBhbGwgZWxlbWVudHMgYW5kIGNvbXB1dGVzCnBlci1lbGVtZW50IHNlbGVjdGl2aXR5 LCBldmVuIHRob3VnaCB0aGUgZmluYWwgcmVzdWx0IGlzIGtub3duLgoKQWRkIGFuIGVhcmx5 IE5VTEwgY2hlY2sgZm9yIGNvbnN0YW50IGFycmF5cyBhbmQgaW1tZWRpYXRlbHkgcmV0dXJu CjAuMCB1bmRlciBBTEwgc2VtYW50aWNzLgoKVGhpcyByZWR1Y2VzIHBsYW5uaW5nIHRpbWUg Zm9yIGxhcmdlIE5PVCBJTiAvIDw+IEFMTCBsaXN0cyB3aXRob3V0CmNoYW5naW5nIHNlbWFu dGljcy4KLS0tCiBzcmMvYmFja2VuZC91dGlscy9hZHQvc2VsZnVuY3MuYyAgICAgICAgICB8 ICA5ICsrKysrCiBzcmMvdGVzdC9yZWdyZXNzL2V4cGVjdGVkL2V4cHJlc3Npb25zLm91dCB8 IDQ0ICsrKysrKysrKysrKysrKysrKysrKysrCiBzcmMvdGVzdC9yZWdyZXNzL3NxbC9leHBy ZXNzaW9ucy5zcWwgICAgICB8IDQxICsrKysrKysrKysrKysrKysrKysrKwogMyBmaWxlcyBj aGFuZ2VkLCA5NCBpbnNlcnRpb25zKCspCgpkaWZmIC0tZ2l0IGEvc3JjL2JhY2tlbmQvdXRp bHMvYWR0L3NlbGZ1bmNzLmMgYi9zcmMvYmFja2VuZC91dGlscy9hZHQvc2VsZnVuY3MuYwpp bmRleCAyOWZlYzY1NTU5My4uZWVmM2YwMzc1YTUgMTAwNjQ0Ci0tLSBhL3NyYy9iYWNrZW5k L3V0aWxzL2FkdC9zZWxmdW5jcy5jCisrKyBiL3NyYy9iYWNrZW5kL3V0aWxzL2FkdC9zZWxm dW5jcy5jCkBAIC0yMDE4LDYgKzIwMTgsMTEgQEAgc2NhbGFyYXJyYXlzZWwoUGxhbm5lcklu Zm8gKnJvb3QsCiAJCWlmIChhcnJheWlzbnVsbCkJCS8qIHF1YWwgY2FuJ3Qgc3VjY2VlZCBp ZiBudWxsIGFycmF5ICovCiAJCQlyZXR1cm4gKFNlbGVjdGl2aXR5KSAwLjA7CiAJCWFycmF5 dmFsID0gRGF0dW1HZXRBcnJheVR5cGVQKGFycmF5ZGF0dW0pOworCisJCS8qIFNlbGVjdGl2 aXR5IG9mICJXSEVSRSB4IE5PVCBJTiAoTlVMTCwgLi4uICkiIGlzIGFsd2F5cyAwICovCisJ CWlmICghdXNlT3IgJiYgYXJyYXlfY29udGFpbnNfbnVsbHMoYXJyYXl2YWwpKQorCQkJcmV0 dXJuIChTZWxlY3Rpdml0eSkgMC4wOworCiAJCWdldF90eXBsZW5ieXZhbGFsaWduKEFSUl9F TEVNVFlQRShhcnJheXZhbCksCiAJCQkJCQkJICZlbG1sZW4sICZlbG1ieXZhbCwgJmVsbWFs aWduKTsKIAkJZGVjb25zdHJ1Y3RfYXJyYXkoYXJyYXl2YWwsCkBAIC0yMTE1LDYgKzIxMjAs MTAgQEAgc2NhbGFyYXJyYXlzZWwoUGxhbm5lckluZm8gKnJvb3QsCiAJCQlMaXN0CSAgICph cmdzOwogCQkJU2VsZWN0aXZpdHkgczI7CiAKKwkJCS8qIFNlbGVjdGl2aXR5IG9mICJXSEVS RSB4IE5PVCBJTiAoTlVMTCwgLi4uICkiIGlzIGFsd2F5cyAwICovCisJCQlpZiAoIXVzZU9y ICYmIElzQShlbGVtLCBDb25zdCkgJiYgKChDb25zdCAqKSBlbGVtKS0+Y29uc3Rpc251bGwp CisJCQkJcmV0dXJuIChTZWxlY3Rpdml0eSkgMC4wOworCiAJCQkvKgogCQkJICogVGhlb3Jl dGljYWxseSwgaWYgZWxlbSBpc24ndCBvZiBub21pbmFsX2VsZW1lbnRfdHlwZSB3ZSBzaG91 bGQKIAkJCSAqIGluc2VydCBhIFJlbGFiZWxUeXBlLCBidXQgaXQgc2VlbXMgdW5saWtlbHkg dGhhdCBhbnkgb3BlcmF0b3IKZGlmZiAtLWdpdCBhL3NyYy90ZXN0L3JlZ3Jlc3MvZXhwZWN0 ZWQvZXhwcmVzc2lvbnMub3V0IGIvc3JjL3Rlc3QvcmVncmVzcy9leHBlY3RlZC9leHByZXNz aW9ucy5vdXQKaW5kZXggOWEzYzk3YjE1YTMuLjM0ZjE0YTU3NzVhIDEwMDY0NAotLS0gYS9z cmMvdGVzdC9yZWdyZXNzL2V4cGVjdGVkL2V4cHJlc3Npb25zLm91dAorKysgYi9zcmMvdGVz dC9yZWdyZXNzL2V4cGVjdGVkL2V4cHJlc3Npb25zLm91dApAQCAtNDI2LDMgKzQyNiw0NyBA QCBzZWxlY3QgKiBmcm9tIGludHRlc3Qgd2hlcmUgYSBub3QgaW4gKDA6Om15aW50LDI6Om15 aW50LDM6Om15aW50LDQ6Om15aW50LDU6Om15aQogKDAgcm93cykKIAogcm9sbGJhY2s7Cist LSBUZXN0IDw+IEFMTCB3aGVuIGFycmF5IGluaXRpYWxseSBjb250YWluZWQgTlVMTCBidXQg bm8gbG9uZ2VyIGRvZXMKK2JlZ2luOworY3JlYXRlIGZ1bmN0aW9uIGNoZWNrX2VzdGltYXRl ZF9yb3dzKHRleHQpIHJldHVybnMgdGFibGUgKGVzdGltYXRlZCBpbnQpCitsYW5ndWFnZSBw bHBnc3FsIGFzCiskJAorZGVjbGFyZQorICAgIGxuIHRleHQ7CisgICAgdG1wIHRleHRbXTsK KyAgICBmaXJzdF9yb3cgYm9vbCA6PSB0cnVlOworYmVnaW4KKyAgICBmb3IgbG4gaW4KKyAg ICAgICAgZXhlY3V0ZSBmb3JtYXQoJ2V4cGxhaW4gJXMnLCAkMSkKKyAgICBsb29wCisgICAg ICAgIGlmIGZpcnN0X3JvdyB0aGVuCisgICAgICAgICAgICBmaXJzdF9yb3cgOj0gZmFsc2U7 CisgICAgICAgICAgICB0bXAgOj0gcmVnZXhwX21hdGNoKGxuLCAncm93cz0oXGQqKScpOwor ICAgICAgICAgICAgcmV0dXJuIHF1ZXJ5IHNlbGVjdCB0bXBbMV06OmludDsKKyAgICAgICAg ZW5kIGlmOworICAgIGVuZCBsb29wOworZW5kOworJCQ7CitjcmVhdGUgZnVuY3Rpb24gcmVw bGFjZV9lbGVtKGFyciBpbnRbXSwgaWR4IGludCwgdmFsIGludCkKK3JldHVybnMgaW50W10g QVMgJCQKK2JlZ2luCisgICAgICBhcnJbaWR4XSA6PSB2YWw7CisgICAgICByZXR1cm4gYXJy OworZW5kOworJCQgbGFuZ3VhZ2UgcGxwZ3NxbCBpbW11dGFibGU7CitjcmVhdGUgdGFibGUg bm90aW5fdGVzdCBhcyBzZWxlY3QgZ2VuZXJhdGVfc2VyaWVzKDEsIDEwMDApIGFzIHg7Cith bmFseXplIG5vdGluX3Rlc3Q7CitzZWxlY3QgKiBmcm9tIGNoZWNrX2VzdGltYXRlZF9yb3dz KCdzZWxlY3QgKiBmcm9tIG5vdGluX3Rlc3Qgd2hlcmUgeCA8PiBhbGwoYXJyYXlbMSw5OSwz XSknKTsKKyBlc3RpbWF0ZWQgCistLS0tLS0tLS0tLQorICAgICAgIDk5NworKDEgcm93KQor CistLSBzYW1lIGFycmF5LCBjb25zdHJ1Y3RlZCBmcm9tIGFuIGFycmF5IHdpdGggYSBOVUxM CitzZWxlY3QgKiBmcm9tIGNoZWNrX2VzdGltYXRlZF9yb3dzKCdzZWxlY3QgKiBmcm9tIG5v dGluX3Rlc3Qgd2hlcmUgeCA8PiBhbGwocmVwbGFjZV9lbGVtKGFycmF5WzEsbnVsbCwzXSwg MiwgOTkpKScpOworIGVzdGltYXRlZCAKKy0tLS0tLS0tLS0tCisgICAgICAgOTk3CisoMSBy b3cpCisKK3JvbGxiYWNrOwpkaWZmIC0tZ2l0IGEvc3JjL3Rlc3QvcmVncmVzcy9zcWwvZXhw cmVzc2lvbnMuc3FsIGIvc3JjL3Rlc3QvcmVncmVzcy9zcWwvZXhwcmVzc2lvbnMuc3FsCmlu ZGV4IGUwMmMyMWYzMzY4Li5jYTk0ODU5YmJmOCAxMDA2NDQKLS0tIGEvc3JjL3Rlc3QvcmVn cmVzcy9zcWwvZXhwcmVzc2lvbnMuc3FsCisrKyBiL3NyYy90ZXN0L3JlZ3Jlc3Mvc3FsL2V4 cHJlc3Npb25zLnNxbApAQCAtMjA5LDMgKzIwOSw0NCBAQCBzZWxlY3QgKiBmcm9tIGludHRl c3Qgd2hlcmUgYSBub3QgaW4gKDE6Om15aW50LDI6Om15aW50LDM6Om15aW50LDQ6Om15aW50 LDU6Om15aQogc2VsZWN0ICogZnJvbSBpbnR0ZXN0IHdoZXJlIGEgbm90IGluICgwOjpteWlu dCwyOjpteWludCwzOjpteWludCw0OjpteWludCw1OjpteWludCwgbnVsbCk7CiAKIHJvbGxi YWNrOworCistLSBUZXN0IDw+IEFMTCB3aGVuIGFycmF5IGluaXRpYWxseSBjb250YWluZWQg TlVMTCBidXQgbm8gbG9uZ2VyIGRvZXMKKworYmVnaW47CisKK2NyZWF0ZSBmdW5jdGlvbiBj aGVja19lc3RpbWF0ZWRfcm93cyh0ZXh0KSByZXR1cm5zIHRhYmxlIChlc3RpbWF0ZWQgaW50 KQorbGFuZ3VhZ2UgcGxwZ3NxbCBhcworJCQKK2RlY2xhcmUKKyAgICBsbiB0ZXh0OworICAg IHRtcCB0ZXh0W107CisgICAgZmlyc3Rfcm93IGJvb2wgOj0gdHJ1ZTsKK2JlZ2luCisgICAg Zm9yIGxuIGluCisgICAgICAgIGV4ZWN1dGUgZm9ybWF0KCdleHBsYWluICVzJywgJDEpCisg ICAgbG9vcAorICAgICAgICBpZiBmaXJzdF9yb3cgdGhlbgorICAgICAgICAgICAgZmlyc3Rf cm93IDo9IGZhbHNlOworICAgICAgICAgICAgdG1wIDo9IHJlZ2V4cF9tYXRjaChsbiwgJ3Jv d3M9KFxkKiknKTsKKyAgICAgICAgICAgIHJldHVybiBxdWVyeSBzZWxlY3QgdG1wWzFdOjpp bnQ7CisgICAgICAgIGVuZCBpZjsKKyAgICBlbmQgbG9vcDsKK2VuZDsKKyQkOworCitjcmVh dGUgZnVuY3Rpb24gcmVwbGFjZV9lbGVtKGFyciBpbnRbXSwgaWR4IGludCwgdmFsIGludCkK K3JldHVybnMgaW50W10gQVMgJCQKK2JlZ2luCisgICAgICBhcnJbaWR4XSA6PSB2YWw7Cisg ICAgICByZXR1cm4gYXJyOworZW5kOworJCQgbGFuZ3VhZ2UgcGxwZ3NxbCBpbW11dGFibGU7 CisKK2NyZWF0ZSB0YWJsZSBub3Rpbl90ZXN0IGFzIHNlbGVjdCBnZW5lcmF0ZV9zZXJpZXMo MSwgMTAwMCkgYXMgeDsKK2FuYWx5emUgbm90aW5fdGVzdDsKKworc2VsZWN0ICogZnJvbSBj aGVja19lc3RpbWF0ZWRfcm93cygnc2VsZWN0ICogZnJvbSBub3Rpbl90ZXN0IHdoZXJlIHgg PD4gYWxsKGFycmF5WzEsOTksM10pJyk7CistLSBzYW1lIGFycmF5LCBjb25zdHJ1Y3RlZCBm cm9tIGFuIGFycmF5IHdpdGggYSBOVUxMCitzZWxlY3QgKiBmcm9tIGNoZWNrX2VzdGltYXRl ZF9yb3dzKCdzZWxlY3QgKiBmcm9tIG5vdGluX3Rlc3Qgd2hlcmUgeCA8PiBhbGwocmVwbGFj ZV9lbGVtKGFycmF5WzEsbnVsbCwzXSwgMiwgOTkpKScpOworCityb2xsYmFjazsKXCBObyBu ZXdsaW5lIGF0IGVuZCBvZiBmaWxlCi0tIAoyLjM0LjEKCg== --------------Ju0qgutIjFRUAPUUx0EksKxD--