Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vxEFY-0048O2-2O for pgsql-hackers@arkaria.postgresql.org; Tue, 03 Mar 2026 01:08:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vxEFX-003pIB-0T for pgsql-hackers@arkaria.postgresql.org; Tue, 03 Mar 2026 01:08:35 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vxEFW-003pI3-2N for pgsql-hackers@lists.postgresql.org; Tue, 03 Mar 2026 01:08:35 +0000 Received: from mail-wr1-x434.google.com ([2a00:1450:4864:20::434]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vxEFV-000000006Fl-1fWj for pgsql-hackers@lists.postgresql.org; Tue, 03 Mar 2026 01:08:34 +0000 Received: by mail-wr1-x434.google.com with SMTP id ffacd0b85a97d-436e8758b91so3271149f8f.0 for ; Mon, 02 Mar 2026 17:08:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1772500111; cv=none; d=google.com; s=arc-20240605; b=MEsvDVOz1v2IDfJGU4ylaiOcUWKUNM3xpI0zSGRrBnnoV/gOEvoRAm0W7g+hEcqNkQ 2QJtEPKau957T/tJrbmEGPBNWX+mz6P4EhdKgDB37pGaRPwy8O8aPuqmS/VpaLprHaD6 fhc4GRVgk9lE3wfbduBK1dgd8uIgq0espeunDvvCie0wtEOplULrLQCip/0yzxTdz0k9 cK4hSHRMc9uNNTaNdx54nHn1NurO4dkUNp0+KydSlWSMGCIc3O9Z9hpxTVMpcXEQ4FmS RfEdfV4zfSF+pPMhZ2BOkgbio15FCfVQ7GekTsOJw/bNR+3zcZSmD895m2dKxL6gVe9n TxTg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=CjAbBAyf9h3Io3GHk6o9DdwCGLZNoflbTVZJwrfFh04=; fh=fqeZCenVAje0wh8su/Pa71/r5NwHzGc734x0pzf/+J4=; b=TsEkdPmleLadiN+DmhMueqUMNWiMrpFCMvcbAjy+63oy8g2c/UTI1xM/Mug/nPwuvv p3VfFJMBlykQUohQAWMOL2H59uqXmK8Kr6CJC64lqedqlk8l0AHUMYDri/s+EYJ3xCa6 th2yjHbCBqeRKTMrRfgcqVhdK9EdtqlawxGvcr8e3CbBWGivbluXGardB3MGT2fC8QS4 VAcBiEaY92ZPyZiuDLiDJngci0Aajl+V9io4qbnfWhFfRC/nYcZRUN4y77ffNM3p1lo0 voPHHDvwEDhIatjVZVJJiBYCKwsjGaWhgyeR5DHwfmLZnW1i7WAsDwW98Baa14Ahhgbl OktQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1772500111; x=1773104911; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=CjAbBAyf9h3Io3GHk6o9DdwCGLZNoflbTVZJwrfFh04=; b=JidfigYfEd+CQ0AxF5JAeVHUbfWqw69Ga7IO1doYY+lPbNhZ4/0EiGkMvWMgIPJ05W STbZyVpHuOwbXZtSuA9SwiHGhzkqsnvocWGcAg2tlmdfCGuHqFEHBnSHQ8m0UvVTSHBO 7IzahhHoiTWLNAPYtmqdIY7WMIssOwLtoL+wcWMkEaMinW88AJbJ02Et9MqnTvblje99 lbgNK/Q/3x0XH/XCBc0Xx2wBPu8kIQYGcXUrnRVW0QNoYQ0yPA5yQQKAYrN5eutenAao kcVcVB5s0+igSkfjguzvKD8U6B5a0sZywuUlEue04pXcxYbucJnwQM0HXcGWx+1GjP06 F8BA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1772500111; x=1773104911; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=CjAbBAyf9h3Io3GHk6o9DdwCGLZNoflbTVZJwrfFh04=; b=ZNpsoMngp5h2GGxvBG7dF76ZvonfcMpLtVMPre3N2aK7rhxdZc2382opSxZxzXURTC EcYdMNbglesBi+0Ftmeb58dYdyz8W0scS4mbncAAohXWYp+PorIh9yqKtHrFByFN8qZ4 A+TZ1fkMAcWBBw5/ItI8vw3UDYLgT5jE9kHXNTaNHqryb/5Lj2g21f6pCemh+ran/+Tn 8ZSImVGPcMwA1m6JL14Noe3shp/hQGDXSjtXXb/YZWW4v1FZ8TptMON01Cse0XZcS2gW zOT1rplfLpeM5hWteIheLqMwAG84SJoEMspyGIZ8a+QxuhJMZ0vzRHljCZPIX4qLuwbO gQzg== X-Forwarded-Encrypted: i=1; AJvYcCWvsvFNQkeW3CuaBryajMFC739ueV0HVmtBhOtUl87Dh1jMovoyg30zrQniOIdspwbFKNDIsOhCGiDQUDn9@lists.postgresql.org X-Gm-Message-State: AOJu0YwtxC8Krca9HWBKIdZbShtezsD3OAU6zQc2WhhfCEKpELCnVwaI pJF3S14YrJ5OCkUIBgHEaDYZeJ2YtjqjHbVJDzn1Q5gRTdOmgilnMofqQSJrkGr/z0dv5tDkkTz CZudTKWgVApSEd7eZaRRxDbFXaNK7cQ0= X-Gm-Gg: ATEYQzzxdnSbSEOv8msN+YMYFyVpBvDrrhHtAHl6RNqpKw4GoFQWPVoi9lcfThylCyW WUnlMJ7mVN9inRuBNQnzg9tx+oWlVo69I07rTwmt8aMyebnZCmxckynu4LdYLzFcegbKr34dbF6 GDxjAHQvDMOlzpzs5eW5gmlgHvkyd1p/xtk5AFMd8Mwf0rD5RysNvdmdNu9txb9CYCOwmdJLDVj OBiSVNQo6MIR299bu/gJVL4mFxDEUMfaz4pI9WG5u+OwCVPQYueAp+ydNayGFIBKfcu3EK1plc/ t4soFP3nezl1WApU+KIkejc0WchX4DYkaPsIacY5k+Yz4sRf8HnoeBbBucAOlmxtXyMQJCCtAQ= = X-Received: by 2002:a05:6000:1888:b0:439:bc31:a05a with SMTP id ffacd0b85a97d-439bc31a3ffmr6795241f8f.43.1772500110431; Mon, 02 Mar 2026 17:08:30 -0800 (PST) MIME-Version: 1.0 References: <52596e08-0902-475b-ad0f-265ec6918f19@gmail.com> <7791ec74-369d-48c7-969c-cf4ad18024f2@tantorlabs.com> <6b3aebb5-f26c-4f50-abd6-e733d452af26@gmail.com> <52f3c637-465d-4f0e-9546-732ecf61ab87@tantorlabs.com> <3fc74079-f429-48c8-ad8c-688b21d4d1c1@gmail.com> <4c761b02-5a60-4076-aa0c-9c6fef06e2c1@tantorlabs.com> <2bdf4a7e-f5b9-4e75-b5a3-1e2608fd3297@tantorlabs.com> In-Reply-To: <2bdf4a7e-f5b9-4e75-b5a3-1e2608fd3297@tantorlabs.com> From: David Rowley Date: Tue, 3 Mar 2026 14:08:19 +1300 X-Gm-Features: AaiRm53Yw80AqP3Cm3K1rcclND_KL9IBsH1kQze00SyuUR-3TSyo2qsUkz6qBn4 Message-ID: Subject: Re: Reduce planning time for large NOT IN lists containing NULL To: Ilia Evdokimov Cc: Zsolt Parragi , David Geier , PostgreSQL Developers Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, 3 Mar 2026 at 04:04, Ilia Evdokimov wrote: > I've fixed this in v5-patch. I had a look at this and wondered if we guarantee that no rows will match, then why can't we perform constant folding on the ScalarArrayOpExpr when !useOr and the array contains a NULL element and the operator is strict. Seemingly, one of the reasons for that is down to the expression returning NULL vs false. Take the following two tests from expressions.out: select return_int_input(1) not in (10, 9, 2, 8, 3, 7, 4, 6, 5, 2, null); ?column? ---------- (1 row) select return_int_input(1) not in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null); ?column? ---------- f (1 row) Here we see that we return false when we find the left operand in the array, but NULL when we don't find it and the array contains NULL. So, unless the left operand is a const, we wouldn't know how to simplify the ScalarArrayOpExpr during planning as the false or NULL would only be known during evaluation of the expression. However, when the expression being simplified is an EXPRKIND_QUAL, it shouldn't matter if the result is false or NULL as both mean the same and there shouldn't be any code that cares about the difference. Currently, we don't pass the "kind" down into eval_const_expressions(), but I don't really see why we couldn't. It would be a fair bit of work figuring out with confidence what the extra arg should be passed as in all the existing call sites of that function. We'd have to document in the header comment for eval_const_expressions() that constant-folding on EXPRKIND_QUAL expressions can enable additional optimisations which disregard the difference between NULL and false. For the patch, I imagine it's still a useful optimisation as the ScalarArrayOpExpr might not be in an EXPRKIND_QUAL. There are a couple of things I don't like: 1) The new test is in expressions.sql. The comment at the top of that file reads: "expression evaluation tests that don't fit into a more specific file". The new test isn't anything to do with expression evaluation. It's about planner estimation. I see that misc_function.sql has the explain_mask_costs() function. I'm not sure that's the right place either, as the usages of that function are for testing SupportRequestRows prosupport functions. I wonder if we need a dedicated row_estimate.sql or selectivity_est.sql file. The explain_mask_costs() wouldn't be out of place if they were moved into a new test like that. It was me that started putting those in misc_function.sql, and I don't object to them being moved to a new test. I'd be as a separate commit, however. 2) The new test creates a new table and inserts 1000 rows. There does not seem to be anything special about the new table. Why don't you use one of the ones from test_setup.sql? 3) Looking at var_eq_const(), it seems like it's coded to assume the operator is always strict, per "If the constant is NULL, assume operator is strict and return zero". If that's good enough for var_eq_const(), then it should be good enough for the new code. I think it would be good if you wrote that or something similar in the new code so that the reader knows taking the short-circuit with non-strict functions is on purpose. David