public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Extract numeric filed in JSONB more effectively
9+ messages / 4 participants
[nested] [flat]

* Re: Extract numeric filed in JSONB more effectively
@ 2023-08-03 13:52 Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  0 siblings, 1 reply; 9+ messages in thread

From: Pavel Stehule @ 2023-08-03 13:52 UTC (permalink / raw)
  To: Andy Fan <[email protected]>; +Cc: jian he <[email protected]>; pgsql-hackers

Hi

čt 3. 8. 2023 v 15:23 odesílatel Andy Fan <[email protected]> napsal:

> Hi:
>
>
>> More, I believe so lot of people uses more common syntax, and then this
>> syntax should to have good performance - for jsonb - (val->'op')::numeric
>> works, and then there should not be performance penalty, because this
>> syntax will be used in 99%.
>>
>
> This looks like a valid opinion IMO,  but to rescue it, we have to do
> something like "internal structure" and remove the existing cast.
> But even we pay the effort, it still breaks some common knowledge,
> since xx:numeric is not a cast.  It is an "internal structure"!
>

I didn't study jsonb function, but there is an xml function that extracts
value and next casts it to some target type. It does what is expected - for
known types use hard coded casts, for other ask system catalog for cast
function or does IO cast. This code is used for the XMLTABLE function. The
JSON_TABLE function is not implemented yet, but there should be similar
code. If you use explicit cast, then the code should not be hard, in the
rewrite stage all information should be known.



>
> I don't think "Black magic" is a proper word here, since it is not much
>>> different from ->> return a text.  If you argue text can be cast to
>>> most-of-types,  that would be a reason, but I doubt this difference
>>> should generate a "black magic".
>>>
>>
>> I used the term black magic, because nobody without reading documentation
>> can find this operator.
>>
>
> I think this is what document is used for..
>
>
>> It is used just for this special case, and the functionality is the same
>> as using cast (only with different performance).
>>
>
> This is not good, but I didn't see a better choice so far,  see my first
> graph.
>
>
>>
>> The operator ->> is more widely used. But if we have some possibility to
>> work without it, then the usage for a lot of users will be more simple.
>> More if the target types can be based on context
>>
>
> It would be cool but still I didn't see a way to do that without making
> something else complex.
>

sure - it is significantly more work, but it should be usable for all types
and just use common syntax. The custom @-> operator you can implement in
your own custom extension. Builtin solutions should be generic as it is
possible.

The things should be as simple as possible - mainly for users, that missing
knowledge, and any other possibility of how to do some task just increases
their confusion. Can be nice if users find one solution on stack overflow
and this solution should be great for performance too. It is worse if users
find more solutions, but it is not too bad, if these solutions have similar
performance. It is too bad if any solution has great performance and others
not too much. Users has not internal knowledge, and then don't understand
why sometimes should to use special operator and not common syntax.


>
>
>>>> Maybe we can introduce some *internal operator* "extract to type", and
>>>> in rewrite stage we can the pattern (x->'field')::type transform to OP(x,
>>>> 'field', typid)
>>>>
>>>
>>> Not sure what the OP should be?  If it is a function, what is the
>>> return value?  It looks to me like it is hard to do in c language?
>>>
>>
>> It should be internal structure - it can be similar like COALESCE or IS
>> operator
>>
>
> It may work, but see my answer in the first graph.
>

>
>>
>>
>>>
>>> After all,  if we really care about the number of operators, I'm OK
>>> with just let users use the function directly, like
>>>
>>> jsonb_field_as_numeric(jsonb, 'filedname')
>>> jsonb_field_as_timestamp(jsonb, 'filedname');
>>> jsonb_field_as_timestamptz(jsonb, 'filedname');
>>> jsonb_field_as_date(jsonb, 'filedname');
>>>
>>> it can save an operator and sloves the readable issue.
>>>
>>
>> I don't like it too much, but it is better than introduction new operator
>>
>
> Good to know it.  Naming operators is a complex task  if we add four.
>
>
>> We already have the jsonb_extract_path and jsonb_extract_path_text
>> function.
>>
>
> I can't follow this.  jsonb_extract_path returns a jsonb, which is  far
> away from
> our goal: return a numeric effectively?
>

I proposed `jsonb_extract_path_type` that is of anyelement type.

Regards

Pavel



> I can imagine to usage "anyelement" type too. some like
>> `jsonb_extract_path_type(jsonb, anyelement, variadic text[] )`
>>
>
> Can you elaborate this please?
>
> --
> Best Regards
> Andy Fan
>


^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
@ 2023-08-03 14:27 ` Andy Fan <[email protected]>
  2023-08-03 19:13   ` Re: Extract numeric filed in JSONB more effectively Tom Lane <[email protected]>
  2023-08-04 03:54   ` Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  0 siblings, 2 replies; 9+ messages in thread

From: Andy Fan @ 2023-08-03 14:27 UTC (permalink / raw)
  To: Pavel Stehule <[email protected]>; +Cc: jian he <[email protected]>; pgsql-hackers

Hi:


>  If you use explicit cast, then the code should not be hard, in the
> rewrite stage all information should be known.
>

Can you point to me where the code is for the XML stuff?  I thought
this is a bad idea but I may accept it if some existing code does
such a thing already.   "such thing"  is  typeA:typeB is
converted something else but user can't find out an entry in
pg_cast for typeA to typeB.


> It would be cool but still I didn't see a way to do that without making
>> something else complex.
>>
>
>  The custom @-> operator you can implement in your own custom extension.
> Builtin solutions should be generic as it is possible.
>

I agree, but actually I think there is no clean way to do it, at least I
dislike the conversion of typeA to typeB in a cast syntax but there
is no entry in pg_cast for it.  Are you saying something like this
or I misunderstood you?

>

-- 
Best Regards
Andy Fan


^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
@ 2023-08-03 19:13   ` Tom Lane <[email protected]>
  2023-08-04 04:38     ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  1 sibling, 1 reply; 9+ messages in thread

From: Tom Lane @ 2023-08-03 19:13 UTC (permalink / raw)
  To: Andy Fan <[email protected]>; +Cc: Pavel Stehule <[email protected]>; jian he <[email protected]>; pgsql-hackers

Andy Fan <[email protected]> writes:
>> If you use explicit cast, then the code should not be hard, in the
>> rewrite stage all information should be known.

> Can you point to me where the code is for the XML stuff?

I think Pavel means XMLTABLE, which IMO is an ugly monstrosity of
syntax --- but count on the SQL committee to do it that way :-(.

As far as the current discussion goes, I'm strongly against
introducing new functions or operators to do the same things that
we already have perfectly good syntax for.  "There's more than one
way to do it" isn't necessarily a virtue, and for sure it isn't a
virtue if people have to rewrite their existing queries to make use
of your optimization.

Also, why stop at optimizing "(jsonbval->'fld')::sometype"?  There are
many other extraction cases that people might wish were faster, such
as json array indexing, nested fields, etc.  It certainly won't make
sense to introduce yet another set of functions for each pattern you
want to optimize --- or at least, we won't want to ask users to change
their queries to invoke those functions explicitly.

I do like the idea of attaching a Simplify planner support function
to jsonb_numeric (and any other ones that seem worth optimizing)
that can convert a stack of jsonb transformations into a bundled
operation that avoids unnecessary conversions.  Then you get the
speedup without any need for people to change their existing queries.
We'd still have functions like jsonb_field_as_numeric() under the
hood, but there's not an expectation that users call them explicitly.
(Alternatively, the output of this Simplify could be a new kind of
expression node that bundles one or more jsonb extractions with a
type conversion.  I don't have an opinion yet on which way is better.)

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-03 19:13   ` Re: Extract numeric filed in JSONB more effectively Tom Lane <[email protected]>
@ 2023-08-04 04:38     ` Andy Fan <[email protected]>
  2023-08-07 03:04       ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  0 siblings, 1 reply; 9+ messages in thread

From: Andy Fan @ 2023-08-04 04:38 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Pavel Stehule <[email protected]>; jian he <[email protected]>; pgsql-hackers

Hi Tom:

On Fri, Aug 4, 2023 at 3:13 AM Tom Lane <[email protected]> wrote:

> Andy Fan <[email protected]> writes:
> >> If you use explicit cast, then the code should not be hard, in the
> >> rewrite stage all information should be known.
>
> > Can you point to me where the code is for the XML stuff?
>
> I think Pavel means XMLTABLE, which IMO is an ugly monstrosity of
> syntax --- but count on the SQL committee to do it that way :-(.
>

Thanks for this input!


>
> As far as the current discussion goes, I'm strongly against
> introducing new functions or operators to do the same things that
> we already have perfectly good syntax for.  "There's more than one
> way to do it" isn't necessarily a virtue, and for sure it isn't a
> virtue if people have to rewrite their existing queries to make use
> of your optimization.
>

I agree, this is always the best/only reason I'd like to accept.


>
>
> I do like the idea of attaching a Simplify planner support function
> to jsonb_numeric (and any other ones that seem worth optimizing)
>

I have a study planner support function today,  that looks great and
I don't think we need much work to do to get our goal, that's amzing.

For all the people who are interested in this topic, I will post a
planner support function soon,  you can check that then.

-- 
Best Regards
Andy Fan


^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-03 19:13   ` Re: Extract numeric filed in JSONB more effectively Tom Lane <[email protected]>
  2023-08-04 04:38     ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
@ 2023-08-07 03:04       ` Andy Fan <[email protected]>
  2023-08-07 06:19         ` Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  0 siblings, 1 reply; 9+ messages in thread

From: Andy Fan @ 2023-08-07 03:04 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Pavel Stehule <[email protected]>; jian he <[email protected]>; pgsql-hackers

Hi:


> For all the people who are interested in this topic, I will post a
> planner support function soon,  you can check that then.
>
>
The updated patch doesn't need users to change their codes and can get
better performance. Thanks for all the feedback which makes things better.

To verify there is no unexpected stuff happening, here is the performance
comparison between master and patched.

create table tb(a jsonb);
insert into tb select '{"a": true, "b": 23.3333}' from generate_series(1,
100000)i;

Master:
select 1 from tb where  (a->'b')::numeric = 1;
Time: 31.020 ms

select 1 from tb where not (a->'a')::boolean;
Time: 25.888 ms

select 1 from tb where  (a->'b')::int2 = 1;
Time: 30.138 ms

select 1 from tb where  (a->'b')::int4 = 1;
Time: 32.384 ms

select 1 from tb where  (a->'b')::int8 = 1;\
Time: 29.922 ms

select 1 from tb where  (a->'b')::float4 = 1;
Time: 54.139 ms

select 1 from tb where  (a->'b')::float8 = 1;
Time: 66.933 ms

Patched:

select 1 from tb where  (a->'b')::numeric = 1;
Time: 15.203 ms

select 1 from tb where not (a->'a')::boolean;
Time: 12.894 ms

select 1 from tb where  (a->'b')::int2 = 1;
Time: 16.847 ms

select 1 from tb where  (a->'b')::int4 = 1;
Time: 17.105 ms

select 1 from tb where  (a->'b')::int8 = 1;
Time: 16.720 ms

select 1 from tb where  (a->'b')::float4 = 1;
Time: 33.409 ms

select 1 from tb where  (a->'b')::float8 = 1;
Time: 34.660 ms

-- 
Best Regards
Andy Fan


Attachments:

  [application/octet-stream] v3-0001-Optimize-extracting-a-given-data-type-from-jsonb.patch (16.9K, 3-v3-0001-Optimize-extracting-a-given-data-type-from-jsonb.patch)
  download | inline diff:
From dcd8d776feaf8c295b51fa2544846c6173ee664c Mon Sep 17 00:00:00 2001
From: Andy Fan <[email protected]>
Date: Mon, 7 Aug 2023 10:44:23 +0800
Subject: [PATCH v3] Optimize extracting a given data type from jsonb.

Previously after we get a JsonbValue, we need to convert it to
Jsonb first then cast the Jsonb to the given type. In this patch,
we covert the JsonbValue to the desired type directly.
---
 src/backend/utils/adt/jsonb.c            | 134 +++++++++++++++++++++++
 src/include/catalog/catversion.h         |   2 +-
 src/include/catalog/pg_proc.dat          |  22 ++--
 src/test/regress/expected/jsonb.out      | 134 ++++++++++++-----------
 src/test/regress/expected/opr_sanity.out |   7 +-
 src/test/regress/sql/jsonb.sql           |  36 +++---
 6 files changed, 244 insertions(+), 91 deletions(-)

diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 9781852b0cb..a577b2a279f 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -17,11 +17,14 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "funcapi.h"
+#include "nodes/makefuncs.h"
+#include "nodes/supportnodes.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/datetime.h"
+#include "utils/fmgroids.h"
 #include "utils/json.h"
 #include "utils/jsonb.h"
 #include "utils/jsonfuncs.h"
@@ -2038,6 +2041,137 @@ cannotCastJsonbValue(enum jbvType type, const char *sqltype)
 	elog(ERROR, "unknown jsonb type: %d", (int) type);
 }
 
+static bool
+jsonb_cast_is_optimized(Oid target_type)
+{
+	switch(target_type)
+	{
+		case NUMERICOID:
+		case BOOLOID:
+		case INT2OID:
+		case INT4OID:
+		case INT8OID:
+		case FLOAT4OID:
+		case FLOAT8OID:
+			return true;
+		default:
+			return false;
+	}
+}
+
+Datum
+jsonb_cast_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr	*fexpr = palloc(sizeof(FuncExpr));
+		OpExpr		*opexpr;
+
+		memcpy(fexpr, req->fcall, sizeof(FuncExpr));
+
+		opexpr = (OpExpr *) linitial(fexpr->args);
+
+		/*
+		 * Simplify cast(jsonb_object_filed(jsonb, filedName) as type)
+		 * to jsonb_object_field_type(jsonb, filedName, targetTypeOid);
+		 */
+		if (IsA(opexpr, OpExpr) &&
+			opexpr->opfuncid  == F_JSONB_OBJECT_FIELD &&
+			jsonb_cast_is_optimized(fexpr->funcresulttype))
+		{
+			fexpr->funcid = F_JSONB_OBJECT_FIELD_TYPE;
+			fexpr->args = opexpr->args;
+
+			/* Tell the jsonb_object_field_type what is the target type. */
+			fexpr->args = lappend(fexpr->args, makeConst(OIDOID, 0, 0, sizeof(Oid),
+														 fexpr->funcresulttype,
+														 false, true));
+		}
+
+		PG_RETURN_POINTER(fexpr);
+	}
+
+	PG_RETURN_POINTER(NULL);
+}
+
+Datum
+jsonb_object_field_type(PG_FUNCTION_ARGS)
+{
+	Jsonb	   *jb = PG_GETARG_JSONB_P(0);
+	text	   *key = PG_GETARG_TEXT_PP(1);
+	Oid			targetOid = PG_GETARG_OID(2);
+
+	JsonbValue *v;
+	JsonbValue	vbuf;
+
+	if (!JB_ROOT_IS_OBJECT(jb))
+		PG_RETURN_NULL();
+
+	v = getKeyJsonValueFromContainer(&jb->root,
+									 VARDATA_ANY(key),
+									 VARSIZE_ANY_EXHDR(key),
+									 &vbuf);
+
+	if (v == NULL)
+		PG_RETURN_NULL();
+
+	switch(targetOid)
+	{
+		Datum	retValue;
+
+		case BOOLOID:
+			if (v->type != jbvBool)
+				cannotCastJsonbValue(v->type, "bool");
+			PG_RETURN_BOOL(v->val.boolean);
+
+		case NUMERICOID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "numeric");
+			PG_RETURN_NUMERIC(v->val.numeric);
+		case INT2OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "smallint");
+			retValue = DirectFunctionCall1(numeric_int2,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+		case INT4OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "integer");
+			retValue = DirectFunctionCall1(numeric_int4,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		case INT8OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "bigint");
+			retValue = DirectFunctionCall1(numeric_int8,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		case FLOAT4OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "real");
+			retValue = DirectFunctionCall1(numeric_float4,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		case FLOAT8OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "double precision");
+			retValue = DirectFunctionCall1(numeric_float8,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		default:
+			elog(ERROR, "cast jsonb field to %d is not supported.", targetOid);
+	}
+
+	PG_RETURN_POINTER(0);
+}
+
 Datum
 jsonb_bool(PG_FUNCTION_ARGS)
 {
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index f507b49bb28..391dfb81b2a 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -57,6 +57,6 @@
  */
 
 /*							yyyymmddN */
-#define CATALOG_VERSION_NO	202307261
+#define CATALOG_VERSION_NO	202308071
 
 #endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 6996073989a..108280b355d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4575,25 +4575,26 @@
   proname => 'pg_lsn', prorettype => 'pg_lsn', proargtypes => 'numeric',
   prosrc => 'numeric_pg_lsn' },
 
-{ oid => '3556', descr => 'convert jsonb to boolean',
+{ oid => '3556', descr => 'convert jsonb to boolean', prosupport => 'jsonb_cast_support',
   proname => 'bool', prorettype => 'bool', proargtypes => 'jsonb',
   prosrc => 'jsonb_bool' },
 { oid => '3449', descr => 'convert jsonb to numeric',
-  proname => 'numeric', prorettype => 'numeric', proargtypes => 'jsonb',
+  proname => 'numeric', prosupport => 'jsonb_cast_support',
+  prorettype => 'numeric', proargtypes => 'jsonb',
   prosrc => 'jsonb_numeric' },
-{ oid => '3450', descr => 'convert jsonb to int2',
+{ oid => '3450', descr => 'convert jsonb to int2', prosupport => 'jsonb_cast_support',
   proname => 'int2', prorettype => 'int2', proargtypes => 'jsonb',
   prosrc => 'jsonb_int2' },
-{ oid => '3451', descr => 'convert jsonb to int4',
+{ oid => '3451', descr => 'convert jsonb to int4', prosupport => 'jsonb_cast_support',
   proname => 'int4', prorettype => 'int4', proargtypes => 'jsonb',
   prosrc => 'jsonb_int4' },
-{ oid => '3452', descr => 'convert jsonb to int8',
+{ oid => '3452', descr => 'convert jsonb to int8', prosupport => 'jsonb_cast_support',
   proname => 'int8', prorettype => 'int8', proargtypes => 'jsonb',
   prosrc => 'jsonb_int8' },
-{ oid => '3453', descr => 'convert jsonb to float4',
+{ oid => '3453', descr => 'convert jsonb to float4', prosupport => 'jsonb_cast_support',
   proname => 'float4', prorettype => 'float4', proargtypes => 'jsonb',
   prosrc => 'jsonb_float4' },
-{ oid => '2580', descr => 'convert jsonb to float8',
+{ oid => '2580', descr => 'convert jsonb to float8', prosupport => 'jsonb_cast_support',
   proname => 'float8', prorettype => 'float8', proargtypes => 'jsonb',
   prosrc => 'jsonb_float8' },
 
@@ -9928,6 +9929,13 @@
   proname => 'jsonb_object_field_text', prorettype => 'text',
   proargtypes => 'jsonb text', proargnames => '{from_json, field_name}',
   prosrc => 'jsonb_object_field_text' },
+{ oid => '3813', descr => 'return a given type specified in desired_type from jsonb field',
+  proname => 'jsonb_object_field_type', prorettype => 'anyelement',
+  proargtypes => 'jsonb text oid', proargnames => '{from_json, field_name, desired_type}',
+  prosrc => 'jsonb_object_field_type'},
+{ oid => '3814', descr => 'planner support for numeric(jsonb)',
+  proname => 'jsonb_cast_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'jsonb_cast_support' },
 { oid => '3215',
   proname => 'jsonb_array_element', prorettype => 'jsonb',
   proargtypes => 'jsonb int4', proargnames => '{from_json, element_index}',
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index 4a16d0dbafb..5a144534cc1 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -5471,107 +5471,113 @@ select ts_headline('[]'::jsonb, tsquery('aaa & bbb'));
 (1 row)
 
 -- casts
-select 'true'::jsonb::bool;
- bool 
-------
- t
+select 'true'::jsonb::bool, ('{"a": true}'::jsonb->'a')::bool;
+ bool | bool 
+------+------
+ t    | t
 (1 row)
 
 select '[]'::jsonb::bool;
 ERROR:  cannot cast jsonb array to type boolean
-select '1.0'::jsonb::float;
- float8 
---------
-      1
+select ('{"a": []}'::jsonb->'a')::bool;
+ERROR:  cannot cast jsonb array to type boolean
+select '1.0'::jsonb::float, ('{"a": 1.0}'::jsonb->'a')::float;
+ float8 | float8 
+--------+--------
+      1 |      1
 (1 row)
 
 select '[1.0]'::jsonb::float;
 ERROR:  cannot cast jsonb array to type double precision
-select '12345'::jsonb::int4;
- int4  
--------
- 12345
+select ('{"a": [1.0]}'::jsonb->'a')::float;
+ERROR:  cannot cast jsonb array to type double precision
+select '12345'::jsonb::int4,  ('{"a": 12345}'::jsonb->'a')::int4;
+ int4  | int4  
+-------+-------
+ 12345 | 12345
 (1 row)
 
 select '"hello"'::jsonb::int4;
 ERROR:  cannot cast jsonb string to type integer
-select '12345'::jsonb::numeric;
- numeric 
----------
-   12345
+select ('{"a": "hello"}'::jsonb->'a')::int4;
+ERROR:  cannot cast jsonb string to type integer
+select '12345'::jsonb::numeric, ('{"a": 12345}'::jsonb->'a')::numeric;
+ numeric | numeric 
+---------+---------
+   12345 |   12345
 (1 row)
 
 select '{}'::jsonb::numeric;
 ERROR:  cannot cast jsonb object to type numeric
-select '12345.05'::jsonb::numeric;
- numeric  
-----------
- 12345.05
+select '12345.05'::jsonb::numeric, ('{"a": 12345.05}'::jsonb->'a')::numeric;
+ numeric  | numeric  
+----------+----------
+ 12345.05 | 12345.05
 (1 row)
 
-select '12345.05'::jsonb::float4;
-  float4  
-----------
- 12345.05
+select '12345.05'::jsonb::float4, ('{"a": 12345.05}'::jsonb->'a')::float4;
+  float4  |  float4  
+----------+----------
+ 12345.05 | 12345.05
 (1 row)
 
-select '12345.05'::jsonb::float8;
-  float8  
-----------
- 12345.05
+select '12345.05'::jsonb::float8, ('{"a": 12345.05}'::jsonb->'a')::float8;
+  float8  |  float8  
+----------+----------
+ 12345.05 | 12345.05
 (1 row)
 
-select '12345.05'::jsonb::int2;
- int2  
--------
- 12345
+select '12345.05'::jsonb::int2, ('{"a": 12345.05}'::jsonb->'a')::int2;
+ int2  | int2  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.05'::jsonb::int4;
- int4  
--------
- 12345
+select '12345.05'::jsonb::int4, ('{"a": 12345.05}'::jsonb->'a')::int4;
+ int4  | int4  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.05'::jsonb::int8;
- int8  
--------
- 12345
+select '12345.05'::jsonb::int8, ('{"a": 12345.05}'::jsonb->'a')::int8;
+ int8  | int8  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric;
-                       numeric                        
-------------------------------------------------------
- 12345.0000000000000000000000000000000000000000000005
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::numeric;
+                       numeric                        |                       numeric                        
+------------------------------------------------------+------------------------------------------------------
+ 12345.0000000000000000000000000000000000000000000005 | 12345.0000000000000000000000000000000000000000000005
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4;
- float4 
---------
-  12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4,  ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float4;
+ float4 | float4 
+--------+--------
+  12345 |  12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8;
- float8 
---------
-  12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float8;
+ float8 | float8 
+--------+--------
+  12345 |  12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2;
- int2  
--------
- 12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int2;
+ int2  | int2  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4;
- int4  
--------
- 12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int4;
+ int4  | int4  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
- int8  
--------
- 12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int8;
+ int8  | int8  
+-------+-------
+ 12345 | 12345
 (1 row)
 
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index a1bdf2c0b5f..b4c36472eb2 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -309,8 +309,8 @@ WHERE p1.prorettype IN
      'anyrange'::regtype = ANY (p1.proargtypes) OR
      'anymultirange'::regtype = ANY (p1.proargtypes))
 ORDER BY 2;
- oid  |    proname     
-------+----------------
+ oid  |         proname         
+------+-------------------------
  2296 | anyarray_in
  2502 | anyarray_recv
  2312 | anyelement_in
@@ -320,7 +320,8 @@ ORDER BY 2;
  2400 | array_recv
  3506 | enum_in
  3532 | enum_recv
-(9 rows)
+ 3813 | jsonb_object_field_type
+(10 rows)
 
 -- anyrange and anymultirange are tighter than the rest, can only resolve
 -- from each other
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index e4b7cdf703d..a0b324c8bef 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -1496,23 +1496,27 @@ select ts_headline('{}'::jsonb, tsquery('aaa & bbb'));
 select ts_headline('[]'::jsonb, tsquery('aaa & bbb'));
 
 -- casts
-select 'true'::jsonb::bool;
+select 'true'::jsonb::bool, ('{"a": true}'::jsonb->'a')::bool;
 select '[]'::jsonb::bool;
-select '1.0'::jsonb::float;
+select ('{"a": []}'::jsonb->'a')::bool;
+select '1.0'::jsonb::float, ('{"a": 1.0}'::jsonb->'a')::float;
 select '[1.0]'::jsonb::float;
-select '12345'::jsonb::int4;
+select ('{"a": [1.0]}'::jsonb->'a')::float;
+select '12345'::jsonb::int4,  ('{"a": 12345}'::jsonb->'a')::int4;
 select '"hello"'::jsonb::int4;
-select '12345'::jsonb::numeric;
+select ('{"a": "hello"}'::jsonb->'a')::int4;
+
+select '12345'::jsonb::numeric, ('{"a": 12345}'::jsonb->'a')::numeric;
 select '{}'::jsonb::numeric;
-select '12345.05'::jsonb::numeric;
-select '12345.05'::jsonb::float4;
-select '12345.05'::jsonb::float8;
-select '12345.05'::jsonb::int2;
-select '12345.05'::jsonb::int4;
-select '12345.05'::jsonb::int8;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
+select '12345.05'::jsonb::numeric, ('{"a": 12345.05}'::jsonb->'a')::numeric;
+select '12345.05'::jsonb::float4, ('{"a": 12345.05}'::jsonb->'a')::float4;
+select '12345.05'::jsonb::float8, ('{"a": 12345.05}'::jsonb->'a')::float8;
+select '12345.05'::jsonb::int2, ('{"a": 12345.05}'::jsonb->'a')::int2;
+select '12345.05'::jsonb::int4, ('{"a": 12345.05}'::jsonb->'a')::int4;
+select '12345.05'::jsonb::int8, ('{"a": 12345.05}'::jsonb->'a')::int8;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::numeric;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4,  ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float4;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float8;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int2;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int4;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int8;
-- 
2.21.0



^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-03 19:13   ` Re: Extract numeric filed in JSONB more effectively Tom Lane <[email protected]>
  2023-08-04 04:38     ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-07 03:04       ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
@ 2023-08-07 06:19         ` Pavel Stehule <[email protected]>
  2023-08-07 07:32           ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  0 siblings, 1 reply; 9+ messages in thread

From: Pavel Stehule @ 2023-08-07 06:19 UTC (permalink / raw)
  To: Andy Fan <[email protected]>; +Cc: Tom Lane <[email protected]>; jian he <[email protected]>; pgsql-hackers

Hi

po 7. 8. 2023 v 5:04 odesílatel Andy Fan <[email protected]> napsal:

> Hi:
>
>
>> For all the people who are interested in this topic, I will post a
>> planner support function soon,  you can check that then.
>>
>>
> The updated patch doesn't need users to change their codes and can get
> better performance. Thanks for all the feedback which makes things better.
>
> To verify there is no unexpected stuff happening, here is the performance
> comparison between master and patched.
>

I am looking on your patch, and the message

+
+ default:
+ elog(ERROR, "cast jsonb field to %d is not supported.", targetOid);

is a little bit messy. This case should not be possible, because it is
filtered by jsonb_cast_is_optimized. So the message should be changed or it
needs a comment.

Regards

Pavel


>
> create table tb(a jsonb);
> insert into tb select '{"a": true, "b": 23.3333}' from generate_series(1,
> 100000)i;
>
> Master:
> select 1 from tb where  (a->'b')::numeric = 1;
> Time: 31.020 ms
>
> select 1 from tb where not (a->'a')::boolean;
> Time: 25.888 ms
>
> select 1 from tb where  (a->'b')::int2 = 1;
> Time: 30.138 ms
>
> select 1 from tb where  (a->'b')::int4 = 1;
> Time: 32.384 ms
>
> select 1 from tb where  (a->'b')::int8 = 1;\
> Time: 29.922 ms
>
> select 1 from tb where  (a->'b')::float4 = 1;
> Time: 54.139 ms
>
> select 1 from tb where  (a->'b')::float8 = 1;
> Time: 66.933 ms
>
> Patched:
>
> select 1 from tb where  (a->'b')::numeric = 1;
> Time: 15.203 ms
>
> select 1 from tb where not (a->'a')::boolean;
> Time: 12.894 ms
>
> select 1 from tb where  (a->'b')::int2 = 1;
> Time: 16.847 ms
>
> select 1 from tb where  (a->'b')::int4 = 1;
> Time: 17.105 ms
>
> select 1 from tb where  (a->'b')::int8 = 1;
> Time: 16.720 ms
>
> select 1 from tb where  (a->'b')::float4 = 1;
> Time: 33.409 ms
>
> select 1 from tb where  (a->'b')::float8 = 1;
> Time: 34.660 ms
>
> --
> Best Regards
> Andy Fan
>


^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-03 19:13   ` Re: Extract numeric filed in JSONB more effectively Tom Lane <[email protected]>
  2023-08-04 04:38     ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-07 03:04       ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-07 06:19         ` Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
@ 2023-08-07 07:32           ` Andy Fan <[email protected]>
  2023-08-07 09:36             ` Re: Extract numeric filed in JSONB more effectively jian he <[email protected]>
  0 siblings, 1 reply; 9+ messages in thread

From: Andy Fan @ 2023-08-07 07:32 UTC (permalink / raw)
  To: Pavel Stehule <[email protected]>; +Cc: Tom Lane <[email protected]>; jian he <[email protected]>; pgsql-hackers

Hi Pavel:

Thanks for the code level review!


>
> I am looking on your patch, and the message
>
> +
> + default:
> + elog(ERROR, "cast jsonb field to %d is not supported.", targetOid);
>
> is a little bit messy. This case should not be possible, because it is
> filtered by jsonb_cast_is_optimized. So the message should be changed or it
> needs a comment.
>

Yes, the double check is not necessary,  that is removed in the attached v4
patch.


-- 
Best Regards
Andy Fan


Attachments:

  [application/octet-stream] v4-0001-Optimize-extracting-a-given-data-type-from-jsonb.patch (16.9K, 3-v4-0001-Optimize-extracting-a-given-data-type-from-jsonb.patch)
  download | inline diff:
From 458d1d85f07b78cfa4dce8d5bf38ddeafc5400ff Mon Sep 17 00:00:00 2001
From: Andy Fan <[email protected]>
Date: Mon, 7 Aug 2023 10:44:23 +0800
Subject: [PATCH v4] Optimize extracting a given data type from jsonb.

Previously after we get a JsonbValue, we need to convert it to
Jsonb first then cast the Jsonb to the given type. In this patch,
we covert the JsonbValue to the desired type directly.
---
 src/backend/utils/adt/jsonb.c            | 135 +++++++++++++++++++++++
 src/include/catalog/catversion.h         |   2 +-
 src/include/catalog/pg_proc.dat          |  22 ++--
 src/test/regress/expected/jsonb.out      | 134 +++++++++++-----------
 src/test/regress/expected/opr_sanity.out |   7 +-
 src/test/regress/sql/jsonb.sql           |  36 +++---
 6 files changed, 245 insertions(+), 91 deletions(-)

diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 9781852b0cb..c72e402eb55 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -17,11 +17,14 @@
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "funcapi.h"
+#include "nodes/makefuncs.h"
+#include "nodes/supportnodes.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/datetime.h"
+#include "utils/fmgroids.h"
 #include "utils/json.h"
 #include "utils/jsonb.h"
 #include "utils/jsonfuncs.h"
@@ -2038,6 +2041,138 @@ cannotCastJsonbValue(enum jbvType type, const char *sqltype)
 	elog(ERROR, "unknown jsonb type: %d", (int) type);
 }
 
+static bool
+jsonb_cast_is_optimized(Oid target_type)
+{
+	switch(target_type)
+	{
+		case NUMERICOID:
+		case BOOLOID:
+		case INT2OID:
+		case INT4OID:
+		case INT8OID:
+		case FLOAT4OID:
+		case FLOAT8OID:
+			return true;
+		default:
+			return false;
+	}
+}
+
+Datum
+jsonb_cast_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr	*fexpr = palloc(sizeof(FuncExpr));
+		OpExpr		*opexpr;
+
+		memcpy(fexpr, req->fcall, sizeof(FuncExpr));
+
+		opexpr = (OpExpr *) linitial(fexpr->args);
+
+		/*
+		 * Simplify cast(jsonb_object_filed(jsonb, filedName) as type)
+		 * to jsonb_object_field_type(jsonb, filedName, targetTypeOid);
+		 */
+		if (IsA(opexpr, OpExpr) &&
+			opexpr->opfuncid  == F_JSONB_OBJECT_FIELD &&
+			jsonb_cast_is_optimized(fexpr->funcresulttype))
+		{
+			fexpr->funcid = F_JSONB_OBJECT_FIELD_TYPE;
+			fexpr->args = opexpr->args;
+
+			/* Tell the jsonb_object_field_type what is the target type. */
+			fexpr->args = lappend(fexpr->args, makeConst(OIDOID, 0, 0, sizeof(Oid),
+														 fexpr->funcresulttype,
+														 false, true));
+		}
+
+		PG_RETURN_POINTER(fexpr);
+	}
+
+	PG_RETURN_POINTER(NULL);
+}
+
+Datum
+jsonb_object_field_type(PG_FUNCTION_ARGS)
+{
+	Jsonb	   *jb = PG_GETARG_JSONB_P(0);
+	text	   *key = PG_GETARG_TEXT_PP(1);
+	Oid			targetOid = PG_GETARG_OID(2);
+
+	JsonbValue *v;
+	JsonbValue	vbuf;
+
+	if (!JB_ROOT_IS_OBJECT(jb))
+		PG_RETURN_NULL();
+
+	v = getKeyJsonValueFromContainer(&jb->root,
+									 VARDATA_ANY(key),
+									 VARSIZE_ANY_EXHDR(key),
+									 &vbuf);
+
+	if (v == NULL)
+		PG_RETURN_NULL();
+
+	switch(targetOid)
+	{
+		Datum	retValue;
+
+		case BOOLOID:
+			if (v->type != jbvBool)
+				cannotCastJsonbValue(v->type, "bool");
+			PG_RETURN_BOOL(v->val.boolean);
+
+		case NUMERICOID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "numeric");
+			PG_RETURN_NUMERIC(v->val.numeric);
+		case INT2OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "smallint");
+			retValue = DirectFunctionCall1(numeric_int2,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+		case INT4OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "integer");
+			retValue = DirectFunctionCall1(numeric_int4,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		case INT8OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "bigint");
+			retValue = DirectFunctionCall1(numeric_int8,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		case FLOAT4OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "real");
+			retValue = DirectFunctionCall1(numeric_float4,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		case FLOAT8OID:
+			if (v->type != jbvNumeric)
+				cannotCastJsonbValue(v->type, "double precision");
+			retValue = DirectFunctionCall1(numeric_float8,
+										   NumericGetDatum(v->val.numeric));
+			PG_RETURN_DATUM(retValue);
+
+		default:
+			Assert(false);
+			break;
+	}
+
+	PG_RETURN_POINTER(0);
+}
+
 Datum
 jsonb_bool(PG_FUNCTION_ARGS)
 {
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index f507b49bb28..391dfb81b2a 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -57,6 +57,6 @@
  */
 
 /*							yyyymmddN */
-#define CATALOG_VERSION_NO	202307261
+#define CATALOG_VERSION_NO	202308071
 
 #endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 6996073989a..108280b355d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4575,25 +4575,26 @@
   proname => 'pg_lsn', prorettype => 'pg_lsn', proargtypes => 'numeric',
   prosrc => 'numeric_pg_lsn' },
 
-{ oid => '3556', descr => 'convert jsonb to boolean',
+{ oid => '3556', descr => 'convert jsonb to boolean', prosupport => 'jsonb_cast_support',
   proname => 'bool', prorettype => 'bool', proargtypes => 'jsonb',
   prosrc => 'jsonb_bool' },
 { oid => '3449', descr => 'convert jsonb to numeric',
-  proname => 'numeric', prorettype => 'numeric', proargtypes => 'jsonb',
+  proname => 'numeric', prosupport => 'jsonb_cast_support',
+  prorettype => 'numeric', proargtypes => 'jsonb',
   prosrc => 'jsonb_numeric' },
-{ oid => '3450', descr => 'convert jsonb to int2',
+{ oid => '3450', descr => 'convert jsonb to int2', prosupport => 'jsonb_cast_support',
   proname => 'int2', prorettype => 'int2', proargtypes => 'jsonb',
   prosrc => 'jsonb_int2' },
-{ oid => '3451', descr => 'convert jsonb to int4',
+{ oid => '3451', descr => 'convert jsonb to int4', prosupport => 'jsonb_cast_support',
   proname => 'int4', prorettype => 'int4', proargtypes => 'jsonb',
   prosrc => 'jsonb_int4' },
-{ oid => '3452', descr => 'convert jsonb to int8',
+{ oid => '3452', descr => 'convert jsonb to int8', prosupport => 'jsonb_cast_support',
   proname => 'int8', prorettype => 'int8', proargtypes => 'jsonb',
   prosrc => 'jsonb_int8' },
-{ oid => '3453', descr => 'convert jsonb to float4',
+{ oid => '3453', descr => 'convert jsonb to float4', prosupport => 'jsonb_cast_support',
   proname => 'float4', prorettype => 'float4', proargtypes => 'jsonb',
   prosrc => 'jsonb_float4' },
-{ oid => '2580', descr => 'convert jsonb to float8',
+{ oid => '2580', descr => 'convert jsonb to float8', prosupport => 'jsonb_cast_support',
   proname => 'float8', prorettype => 'float8', proargtypes => 'jsonb',
   prosrc => 'jsonb_float8' },
 
@@ -9928,6 +9929,13 @@
   proname => 'jsonb_object_field_text', prorettype => 'text',
   proargtypes => 'jsonb text', proargnames => '{from_json, field_name}',
   prosrc => 'jsonb_object_field_text' },
+{ oid => '3813', descr => 'return a given type specified in desired_type from jsonb field',
+  proname => 'jsonb_object_field_type', prorettype => 'anyelement',
+  proargtypes => 'jsonb text oid', proargnames => '{from_json, field_name, desired_type}',
+  prosrc => 'jsonb_object_field_type'},
+{ oid => '3814', descr => 'planner support for numeric(jsonb)',
+  proname => 'jsonb_cast_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'jsonb_cast_support' },
 { oid => '3215',
   proname => 'jsonb_array_element', prorettype => 'jsonb',
   proargtypes => 'jsonb int4', proargnames => '{from_json, element_index}',
diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out
index 4a16d0dbafb..5a144534cc1 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -5471,107 +5471,113 @@ select ts_headline('[]'::jsonb, tsquery('aaa & bbb'));
 (1 row)
 
 -- casts
-select 'true'::jsonb::bool;
- bool 
-------
- t
+select 'true'::jsonb::bool, ('{"a": true}'::jsonb->'a')::bool;
+ bool | bool 
+------+------
+ t    | t
 (1 row)
 
 select '[]'::jsonb::bool;
 ERROR:  cannot cast jsonb array to type boolean
-select '1.0'::jsonb::float;
- float8 
---------
-      1
+select ('{"a": []}'::jsonb->'a')::bool;
+ERROR:  cannot cast jsonb array to type boolean
+select '1.0'::jsonb::float, ('{"a": 1.0}'::jsonb->'a')::float;
+ float8 | float8 
+--------+--------
+      1 |      1
 (1 row)
 
 select '[1.0]'::jsonb::float;
 ERROR:  cannot cast jsonb array to type double precision
-select '12345'::jsonb::int4;
- int4  
--------
- 12345
+select ('{"a": [1.0]}'::jsonb->'a')::float;
+ERROR:  cannot cast jsonb array to type double precision
+select '12345'::jsonb::int4,  ('{"a": 12345}'::jsonb->'a')::int4;
+ int4  | int4  
+-------+-------
+ 12345 | 12345
 (1 row)
 
 select '"hello"'::jsonb::int4;
 ERROR:  cannot cast jsonb string to type integer
-select '12345'::jsonb::numeric;
- numeric 
----------
-   12345
+select ('{"a": "hello"}'::jsonb->'a')::int4;
+ERROR:  cannot cast jsonb string to type integer
+select '12345'::jsonb::numeric, ('{"a": 12345}'::jsonb->'a')::numeric;
+ numeric | numeric 
+---------+---------
+   12345 |   12345
 (1 row)
 
 select '{}'::jsonb::numeric;
 ERROR:  cannot cast jsonb object to type numeric
-select '12345.05'::jsonb::numeric;
- numeric  
-----------
- 12345.05
+select '12345.05'::jsonb::numeric, ('{"a": 12345.05}'::jsonb->'a')::numeric;
+ numeric  | numeric  
+----------+----------
+ 12345.05 | 12345.05
 (1 row)
 
-select '12345.05'::jsonb::float4;
-  float4  
-----------
- 12345.05
+select '12345.05'::jsonb::float4, ('{"a": 12345.05}'::jsonb->'a')::float4;
+  float4  |  float4  
+----------+----------
+ 12345.05 | 12345.05
 (1 row)
 
-select '12345.05'::jsonb::float8;
-  float8  
-----------
- 12345.05
+select '12345.05'::jsonb::float8, ('{"a": 12345.05}'::jsonb->'a')::float8;
+  float8  |  float8  
+----------+----------
+ 12345.05 | 12345.05
 (1 row)
 
-select '12345.05'::jsonb::int2;
- int2  
--------
- 12345
+select '12345.05'::jsonb::int2, ('{"a": 12345.05}'::jsonb->'a')::int2;
+ int2  | int2  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.05'::jsonb::int4;
- int4  
--------
- 12345
+select '12345.05'::jsonb::int4, ('{"a": 12345.05}'::jsonb->'a')::int4;
+ int4  | int4  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.05'::jsonb::int8;
- int8  
--------
- 12345
+select '12345.05'::jsonb::int8, ('{"a": 12345.05}'::jsonb->'a')::int8;
+ int8  | int8  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric;
-                       numeric                        
-------------------------------------------------------
- 12345.0000000000000000000000000000000000000000000005
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::numeric;
+                       numeric                        |                       numeric                        
+------------------------------------------------------+------------------------------------------------------
+ 12345.0000000000000000000000000000000000000000000005 | 12345.0000000000000000000000000000000000000000000005
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4;
- float4 
---------
-  12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4,  ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float4;
+ float4 | float4 
+--------+--------
+  12345 |  12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8;
- float8 
---------
-  12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float8;
+ float8 | float8 
+--------+--------
+  12345 |  12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2;
- int2  
--------
- 12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int2;
+ int2  | int2  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4;
- int4  
--------
- 12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int4;
+ int4  | int4  
+-------+-------
+ 12345 | 12345
 (1 row)
 
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
- int8  
--------
- 12345
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int8;
+ int8  | int8  
+-------+-------
+ 12345 | 12345
 (1 row)
 
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index a1bdf2c0b5f..b4c36472eb2 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -309,8 +309,8 @@ WHERE p1.prorettype IN
      'anyrange'::regtype = ANY (p1.proargtypes) OR
      'anymultirange'::regtype = ANY (p1.proargtypes))
 ORDER BY 2;
- oid  |    proname     
-------+----------------
+ oid  |         proname         
+------+-------------------------
  2296 | anyarray_in
  2502 | anyarray_recv
  2312 | anyelement_in
@@ -320,7 +320,8 @@ ORDER BY 2;
  2400 | array_recv
  3506 | enum_in
  3532 | enum_recv
-(9 rows)
+ 3813 | jsonb_object_field_type
+(10 rows)
 
 -- anyrange and anymultirange are tighter than the rest, can only resolve
 -- from each other
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index e4b7cdf703d..a0b324c8bef 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -1496,23 +1496,27 @@ select ts_headline('{}'::jsonb, tsquery('aaa & bbb'));
 select ts_headline('[]'::jsonb, tsquery('aaa & bbb'));
 
 -- casts
-select 'true'::jsonb::bool;
+select 'true'::jsonb::bool, ('{"a": true}'::jsonb->'a')::bool;
 select '[]'::jsonb::bool;
-select '1.0'::jsonb::float;
+select ('{"a": []}'::jsonb->'a')::bool;
+select '1.0'::jsonb::float, ('{"a": 1.0}'::jsonb->'a')::float;
 select '[1.0]'::jsonb::float;
-select '12345'::jsonb::int4;
+select ('{"a": [1.0]}'::jsonb->'a')::float;
+select '12345'::jsonb::int4,  ('{"a": 12345}'::jsonb->'a')::int4;
 select '"hello"'::jsonb::int4;
-select '12345'::jsonb::numeric;
+select ('{"a": "hello"}'::jsonb->'a')::int4;
+
+select '12345'::jsonb::numeric, ('{"a": 12345}'::jsonb->'a')::numeric;
 select '{}'::jsonb::numeric;
-select '12345.05'::jsonb::numeric;
-select '12345.05'::jsonb::float4;
-select '12345.05'::jsonb::float8;
-select '12345.05'::jsonb::int2;
-select '12345.05'::jsonb::int4;
-select '12345.05'::jsonb::int8;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4;
-select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
+select '12345.05'::jsonb::numeric, ('{"a": 12345.05}'::jsonb->'a')::numeric;
+select '12345.05'::jsonb::float4, ('{"a": 12345.05}'::jsonb->'a')::float4;
+select '12345.05'::jsonb::float8, ('{"a": 12345.05}'::jsonb->'a')::float8;
+select '12345.05'::jsonb::int2, ('{"a": 12345.05}'::jsonb->'a')::int2;
+select '12345.05'::jsonb::int4, ('{"a": 12345.05}'::jsonb->'a')::int4;
+select '12345.05'::jsonb::int8, ('{"a": 12345.05}'::jsonb->'a')::int8;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::numeric;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float4,  ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float4;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::float8;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int2;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int4;
+select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8, ('{"a": 12345.0000000000000000000000000000000000000000000005}'::jsonb->'a')::int8;
-- 
2.21.0



^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-03 19:13   ` Re: Extract numeric filed in JSONB more effectively Tom Lane <[email protected]>
  2023-08-04 04:38     ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-07 03:04       ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
  2023-08-07 06:19         ` Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-07 07:32           ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
@ 2023-08-07 09:36             ` jian he <[email protected]>
  0 siblings, 0 replies; 9+ messages in thread

From: jian he @ 2023-08-07 09:36 UTC (permalink / raw)
  To: Andy Fan <[email protected]>; +Cc: Pavel Stehule <[email protected]>; Tom Lane <[email protected]>; pgsql-hackers

Hi.

+Datum
+jsonb_object_field_type(PG_FUNCTION_ARGS)
+{
+ Jsonb    *jb = PG_GETARG_JSONB_P(0);
+ text    *key = PG_GETARG_TEXT_PP(1);
+ Oid targetOid = PG_GETARG_OID(2);

compared with jsonb_numeric. I am wondering if you need a free *jb.
elog(INFO,"jb=%p arg pointer=%p ", jb, PG_GETARG_POINTER(0));
says there two are not the same.






^ permalink  raw  reply  [nested|flat] 9+ messages in thread

* Re: Extract numeric filed in JSONB more effectively
  2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
  2023-08-03 14:27 ` Re: Extract numeric filed in JSONB more effectively Andy Fan <[email protected]>
@ 2023-08-04 03:54   ` Pavel Stehule <[email protected]>
  1 sibling, 0 replies; 9+ messages in thread

From: Pavel Stehule @ 2023-08-04 03:54 UTC (permalink / raw)
  To: Andy Fan <[email protected]>; +Cc: jian he <[email protected]>; pgsql-hackers

čt 3. 8. 2023 v 16:27 odesílatel Andy Fan <[email protected]> napsal:

> Hi:
>
>
>>  If you use explicit cast, then the code should not be hard, in the
>> rewrite stage all information should be known.
>>
>
> Can you point to me where the code is for the XML stuff?  I thought
> this is a bad idea but I may accept it if some existing code does
> such a thing already.   "such thing"  is  typeA:typeB is
> converted something else but user can't find out an entry in
> pg_cast for typeA to typeB.
>

in XML there is src/backend/utils/adt/xml.c, the XmlTableGetValue routine.
It is not an internal transformation - and from XML type to some else.

you can look at parser - parse_expr, parse_func. You can watch the
lifecycle of :: operator. There are transformations of nodes to different
nodes

you can look to patches related to SQL/JSON (not fully committed yet) and
json_table




>
>
>> It would be cool but still I didn't see a way to do that without making
>>> something else complex.
>>>
>>
>>  The custom @-> operator you can implement in your own custom extension.
>> Builtin solutions should be generic as it is possible.
>>
>
> I agree, but actually I think there is no clean way to do it, at least I
> dislike the conversion of typeA to typeB in a cast syntax but there
> is no entry in pg_cast for it.  Are you saying something like this
> or I misunderstood you?
>

There is not any possibility of user level space.  The conversions should
be supported by cast from pg_cast, where it is possible. When it is
impossible, then you can raise an exception in some strict mode, or you can
do IO cast. But this is not hard part

You should to teach parser to push type info deeper to some nodes about
expected result

(2023-08-04 05:28:36) postgres=# select ('{"a":2,
"b":"nazdar"}'::jsonb)['a']::numeric;
┌─────────┐
│ numeric │
╞═════════╡
│       2 │
└─────────┘
(1 row)

(2023-08-04 05:28:36) postgres=# select ('{"a":2,
"b":"nazdar"}'::jsonb)['a']::numeric;
┌─────────┐
│ numeric │
╞═════════╡
│       2 │
└─────────┘
(1 row)

(2023-08-04 05:28:41) postgres=# select ('{"a":2,
"b":"nazdar"}'::jsonb)['a']::int;
┌──────┐
│ int4 │
╞══════╡
│    2 │
└──────┘
(1 row)

when the parser iterates over the expression, it crosses ::type node first,
so you have information about the target type. Currently this information
is used when the parser is going back and when the source type is the same
as the target type, the cast can be ignored. Probably it needs to add some
flag to the operator if they are able to use this. Maybe it can be a new
third argument with an expected type. So new kinds of op functions can look
like opfx("any", "any", anyelement) returns anyelement. Maybe you find
another possibility. It can be invisible for me (or for you) now.

It is much more work, but the benefits will be generic. I think this is an
important part for container types, so partial fix is not good, and it
requires a system solution. The performance is important, but without
generic solutions, the complexity increases, and this is a much bigger
problem.

Regards

Pavel




>
> --
> Best Regards
> Andy Fan
>


^ permalink  raw  reply  [nested|flat] 9+ messages in thread


end of thread, other threads:[~2023-08-07 09:36 UTC | newest]

Thread overview: 9+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2023-08-03 13:52 Re: Extract numeric filed in JSONB more effectively Pavel Stehule <[email protected]>
2023-08-03 14:27 ` Andy Fan <[email protected]>
2023-08-03 19:13   ` Tom Lane <[email protected]>
2023-08-04 04:38     ` Andy Fan <[email protected]>
2023-08-07 03:04       ` Andy Fan <[email protected]>
2023-08-07 06:19         ` Pavel Stehule <[email protected]>
2023-08-07 07:32           ` Andy Fan <[email protected]>
2023-08-07 09:36             ` jian he <[email protected]>
2023-08-04 03:54   ` Pavel Stehule <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox