Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rh7VT-009hpy-Sm for pgsql-hackers@arkaria.postgresql.org; Mon, 04 Mar 2024 12:33:24 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rh7VR-007TFo-B8 for pgsql-hackers@arkaria.postgresql.org; Mon, 04 Mar 2024 12:33:21 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rh7VQ-007TCo-7r for pgsql-hackers@lists.postgresql.org; Mon, 04 Mar 2024 12:33:21 +0000 Received: from fhigh7-smtp.messagingengine.com ([103.168.172.158]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rh7VM-002nQK-Io for pgsql-hackers@lists.postgresql.org; Mon, 04 Mar 2024 12:33:19 +0000 Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 3300E11400EE; Mon, 4 Mar 2024 07:33:16 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Mon, 04 Mar 2024 07:33:16 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eisentraut.org; h=cc:cc:content-transfer-encoding:content-type:content-type :date:date:from:from:in-reply-to:in-reply-to:message-id :mime-version:references:reply-to:subject:subject:to:to; s=fm3; t=1709555596; x=1709641996; bh=U5J0R4UJk0sVh3cIqRUbi2p8c/LZDwef nYZ/ID0ORxU=; b=Cwv11YB0LAQk2sCZhI3rd+08TqFPj3IyaqXc7m93fODGSZzV K83nNSdQJjowensjdY+klwvSyNsxj3mSy43A1cSK171a6M9GOfq0uOvFzhg6tZiq C7b4XRVKHqAt++1CBtqyzKHxj+b7SMfy5jmeCZ4zkgddUTKW0bPvrdzINphi656v U0GslvuqkBPni8n5e1Cikwa7VKKbcKlhPL5N58jC4M24uE7Xb9ty1dJtpWvrP9Zk VKxRAiCclvIK1o61R8G82tWF93ctn2RelhEvT+U/cH9uuMrahOsu6gNb5OTNfJ+y PNZTRuCyKeOLM9tDi8MK9cruiMzY+6fUvxQ/Ow== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1709555596; x= 1709641996; bh=U5J0R4UJk0sVh3cIqRUbi2p8c/LZDwefnYZ/ID0ORxU=; b=H cnWv4ilUQj6kuqRESdHTe0wMiLW/Wxi/DAb/B30fNw185raOHJNZ7tTJHFGDFkyT NYBi7av5lJ9ye3Z2IcdvOrsj104sdMm9WD4hUERI5ibbJ6bTYFDIhi6fRKXb217r lPXeHfRAuCELGrG93bXomhVKwWjiXA/KBmWpPRyLVa40S1y7Gt5wPBFXkyBtTGUd 51O/8yvXEW8i6sxvBkoXEbT7e8FLOvEzHJD3zm50K4FLVEghFHtqDUTSRN9B3gtv 13WjNlpwamvAKoy0wMVZJ4w0U7YQ1K5ovYnSRVyYg753ulFGFfxImU3aRut4bSWU HZScT0XbhkH7DodO2h5bw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrheejgdegtdcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefkffggfgfuvfevfhfhjggtgfesthejredttddvjeenucfhrhhomheprfgvthgv rhcugfhishgvnhhtrhgruhhtuceophgvthgvrhesvghishgvnhhtrhgruhhtrdhorhhgqe enucggtffrrghtthgvrhhnpeeljedtudfggffffefhieetfeehtedukeejuedtlefhgefg ffdujeehueefudffkeenucffohhmrghinhepphhoshhtghhrvghsqhhlrdhorhhgnecuve hluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphgvthgvrhes vghishgvnhhtrhgruhhtrdhorhhg X-ME-Proxy: Feedback-ID: ie0a040ee:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 4 Mar 2024 07:33:14 -0500 (EST) Message-ID: <8102ff5b-b156-409e-a48f-e53e63a39b36@eisentraut.org> Date: Mon, 4 Mar 2024 13:33:12 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Extract numeric filed in JSONB more effectively Content-Language: en-US To: Andy Fan , jian he Cc: Chapman Flack , pgsql-hackers@lists.postgresql.org References: <169880504467.94392.3769687331705514588.pgcf@coridan.postgresql.org> <87a5rry0bz.fsf@163.com> <87h6jpob9x.fsf@163.com> <87r0hmvuvr.fsf@163.com> From: Peter Eisentraut In-Reply-To: <87r0hmvuvr.fsf@163.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 09.02.24 10:05, Andy Fan wrote: > 2. Where is the current feature blocked for the past few months? > > It's error message compatible issue! Continue with above setup: > > master: > > select * from tb where (a->'b')::numeric > 3::numeric; > ERROR: cannot cast jsonb string to type numeric > > select * from tb where (a->'b')::int4 > 3::numeric; > ERROR: cannot cast jsonb string to type integer > > You can see the error message is different (numeric vs integer). > > > Patched: > > We still can get the same error message as master BUT the code > looks odd. > > select * from tb where (a->'b')::int4 > 3; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------- > Seq Scan on public.tb > Output: a > Filter: ((jsonb_finish_numeric(jsonb_object_field_start((tb.a)::internal, 'b'::text), '23'::oid))::integer > 3) > (3 rows) > > You can see "jsonb_finish_numeric(.., '23::oid)" the '23::oid' is just > for the *"integer"* output in error message: > > "cannot cast jsonb string to type*integer*" > > Now the sistuation is either we use the odd argument (23::oid) in > jsonb_finish_numeric, or we use a incompatible error message with the > previous version. I'm not sure which way is better, but this is the > place the current feature is blocked. I'm not bothered by that. It also happens on occasion in the backend C code that we pass around extra information to be able to construct better error messages. The functions here are not backend C code, but they are internal functions, so similar considerations can apply. But I have a different question about this patch set. This has some overlap with the JSON_VALUE function that is being discussed at [0][1]. For example, if I apply the patch v39-0001-Add-SQL-JSON-query-functions.patch from that thread, I can run select count(*) from tb where json_value(a, '$.a' returning numeric) = 2; and I get a noticeable performance boost over select count(*) from tb where cast (a->'a' as numeric) = 2; So some questions to think about: 1. Compare performance of base case vs. this patch vs. json_value. 2. Can json_value be optimized further? 3. Is this patch still needed? 3a. If yes, should the internal rewriting make use of json_value or share code with it? [0]: https://www.postgresql.org/message-id/flat/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com [1]: https://commitfest.postgresql.org/47/4377/