Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qRlty-002ut9-AT for pgsql-hackers@arkaria.postgresql.org; Fri, 04 Aug 2023 03:54:58 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qRltw-0039yJ-FZ for pgsql-hackers@arkaria.postgresql.org; Fri, 04 Aug 2023 03:54:56 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qRltw-0039yA-2A for pgsql-hackers@lists.postgresql.org; Fri, 04 Aug 2023 03:54:56 +0000 Received: from mail-yw1-x1134.google.com ([2607:f8b0:4864:20::1134]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qRltt-000Qt7-OZ for pgsql-hackers@postgresql.org; Fri, 04 Aug 2023 03:54:55 +0000 Received: by mail-yw1-x1134.google.com with SMTP id 00721157ae682-583f99641adso18391977b3.2 for ; Thu, 03 Aug 2023 20:54:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1691121293; x=1691726093; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=jQJJODUguzHmITHYQKSZFwMRefWy7h5vzEq7UUdqkOo=; b=mnlgATi8qOFOepYzLJgVhBUXstm/X+c752FWfNIydjXaZSZqtdFWx6zE9N/RH2C1X/ fVox48I2qIQxdCbIcNutq0dS9MoSnqhoaNxrrR9Y5IC6Q9WDetu28WUWZSTs0Olt/zBc /7DwupyBo/lyp2TMW9D23HFjMaBOnWcLJCYEQ0IMk3Xywh15u8d/r6qRGqHRrEjiEY3c OD+vQeKEcr+NQgBQEb2gwKONRxy7WFmkAy0lTqg7/8KsSJ32pc2NZNVdaUBeENg+TvA4 zTjb6ybQOVTAlIRupSsc9ZT5vRBL5+6CFyTWnueVjEhUZy+b0AtJxWUfbwRB7Gz5g9xF 5F7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691121293; x=1691726093; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=jQJJODUguzHmITHYQKSZFwMRefWy7h5vzEq7UUdqkOo=; b=ePPuavOY9g+gSTlQQAn+DASZZDyTFzf5Itosl7sHaYBa1a0xlTFmBgMlt1tTcRsk1i qoOBhZ3HRvCtwcix7XKLXvjo3iLg0bYjdyEQJ5sNbgWsLrcgigHHx9hS5O3CY7BW+Uoy L2gSMXwScSLKsQL8tz1N89xJBirtpD5cB3FKq4npXU+8wVth5fGLBfggc2BBwf/vO0gW Feln7l83fPN4iO406Hoh5XzkrDW+x+FTOfIu5JptfELSqCJOFJUuoxV/5f1pveqRlH5K Oddz8+va0Yyf1mOxa+NiwpaPiNr5YZ6y44FFoOJYgli2dI7Tp15kNN1p6M9O7cTI2Rqo peZw== X-Gm-Message-State: AOJu0YzD/7mJkBGyGVvF46MnNGoOFSRyCR8VIgwMoHLy3nPVABsbkpAb Hp5+vkffR8fRzK0l1zc0PmRgabppyds73KuyaRE= X-Google-Smtp-Source: AGHT+IGyvD0anut6WL43Z0smhNXxKNNAdaVy7gb5BH/mFFWiVZsyz1RepL8bUb2qdOQC6MqeeWbp16cyojJjjD9jWqo= X-Received: by 2002:a81:a110:0:b0:57a:5039:aa01 with SMTP id y16-20020a81a110000000b0057a5039aa01mr575501ywg.22.1691121292920; Thu, 03 Aug 2023 20:54:52 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Pavel Stehule Date: Fri, 4 Aug 2023 05:54:15 +0200 Message-ID: Subject: Re: Extract numeric filed in JSONB more effectively To: Andy Fan Cc: jian he , pgsql-hackers Content-Type: multipart/alternative; boundary="000000000000efcc61060210db24" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000efcc61060210db24 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable =C4=8Dt 3. 8. 2023 v 16:27 odes=C3=ADlatel Andy Fan napsal: > Hi: > > >> If you use explicit cast, then the code should not be hard, in the >> rewrite stage all information should be known. >> > > Can you point to me where the code is for the XML stuff? I thought > this is a bad idea but I may accept it if some existing code does > such a thing already. "such thing" is typeA:typeB is > converted something else but user can't find out an entry in > pg_cast for typeA to typeB. > in XML there is src/backend/utils/adt/xml.c, the XmlTableGetValue routine. It is not an internal transformation - and from XML type to some else. you can look at parser - parse_expr, parse_func. You can watch the lifecycle of :: operator. There are transformations of nodes to different nodes you can look to patches related to SQL/JSON (not fully committed yet) and json_table > > >> It would be cool but still I didn't see a way to do that without making >>> something else complex. >>> >> >> The custom @-> operator you can implement in your own custom extension. >> Builtin solutions should be generic as it is possible. >> > > I agree, but actually I think there is no clean way to do it, at least I > dislike the conversion of typeA to typeB in a cast syntax but there > is no entry in pg_cast for it. Are you saying something like this > or I misunderstood you? > There is not any possibility of user level space. The conversions should be supported by cast from pg_cast, where it is possible. When it is impossible, then you can raise an exception in some strict mode, or you can do IO cast. But this is not hard part You should to teach parser to push type info deeper to some nodes about expected result (2023-08-04 05:28:36) postgres=3D# select ('{"a":2, "b":"nazdar"}'::jsonb)['a']::numeric; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=90 =E2=94=82 numeric =E2=94=82 =E2=95=9E=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2= =95=90=E2=95=90=E2=95=A1 =E2=94=82 2 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=98 (1 row) (2023-08-04 05:28:36) postgres=3D# select ('{"a":2, "b":"nazdar"}'::jsonb)['a']::numeric; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=90 =E2=94=82 numeric =E2=94=82 =E2=95=9E=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2= =95=90=E2=95=90=E2=95=A1 =E2=94=82 2 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=98 (1 row) (2023-08-04 05:28:41) postgres=3D# select ('{"a":2, "b":"nazdar"}'::jsonb)['a']::int; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=90 =E2=94=82 int4 =E2=94=82 =E2=95=9E=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=A1 =E2=94=82 2 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=98 (1 row) when the parser iterates over the expression, it crosses ::type node first, so you have information about the target type. Currently this information is used when the parser is going back and when the source type is the same as the target type, the cast can be ignored. Probably it needs to add some flag to the operator if they are able to use this. Maybe it can be a new third argument with an expected type. So new kinds of op functions can look like opfx("any", "any", anyelement) returns anyelement. Maybe you find another possibility. It can be invisible for me (or for you) now. It is much more work, but the benefits will be generic. I think this is an important part for container types, so partial fix is not good, and it requires a system solution. The performance is important, but without generic solutions, the complexity increases, and this is a much bigger problem. Regards Pavel > > -- > Best Regards > Andy Fan > --000000000000efcc61060210db24 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
=C4=8Dt 3. 8. 2023 v=C2=A016:27 odes= =C3=ADlatel Andy Fan <zhihui= .fan1213@gmail.com> napsal:
Hi:=C2=A0


=C2=A0If you use explicit cast, then the code should = not be hard, in the rewrite stage all information should be known.

Can you point to me where the cod= e is for the XML stuff?=C2=A0 I thought
this is a bad idea but I = may accept it if some existing code does
such a thing already.=C2= =A0 =C2=A0"such thing"=C2=A0 is=C2=A0 typeA:typeB is
co= nverted something else but user can't find out an entry in
pg= _cast for typeA to typeB.=C2=A0

in XML there is src/backend/utils/adt/xml.c, the=C2=A0XmlTableGetVal= ue routine. It is not an internal transformation - and from XML type to som= e else.

you can look at parser - parse_expr, parse= _func. You can watch the lifecycle of :: operator. There are transformation= s of nodes to different nodes

you can look to = patches related to SQL/JSON (not fully committed yet) and json_table
<= div>

=C2=A0
=C2= =A0
It would be cool but st= ill I didn't see a way to do that without making
something el= se complex.=C2=A0

=C2=A0T= he custom @-> operator you can=20 implement in your own custom extension. Builtin solutions should be=20 generic as it is possible.

I agree, but actually I think there is no clean way to do it, at least I<= /div>
dislike the conversion of=C2=A0typeA to typeB in a cast syntax=C2= =A0but there
is no entry in pg_cast for it.=C2=A0 Are you saying = something like this=C2=A0
or I misunderstood you?=C2=A0

There is not any possibility of us= er level space.=C2=A0 The conversions should be supported by cast from pg_c= ast, where it is possible. When it is impossible, then you can raise an exc= eption in some strict mode, or you can do IO cast. But this is not hard par= t

You should to teach parser to push type info dee= per to some nodes about expected result

(2023-08-04 05:28:36) postgres=3D# select ('= ;{"a":2, "b":"nazdar"}'::jsonb)['a= 9;]::numeric;
=E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=90
=E2=94=82 numeric =E2=94=82=E2=95=9E=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90= =E2=95=90=E2=95=90=E2=95=A1
=E2=94=82 =C2=A0 =C2=A0 =C2=A0 2 =E2=94=82=E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=98
(1 row)

(2023-08-04 05:28:36) postgres=3D# select ('{"a":2, &= quot;b":"nazdar"}'::jsonb)['a']::numeric;
=E2= =94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=90
=E2=94=82 numeric =E2=94=82
=E2=95=9E=E2=95=90= =E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2=95=90=E2= =95=A1
=E2=94=82 =C2=A0 =C2=A0 =C2=A0 2 =E2=94=82
=E2=94=94=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=98
(1 row)

(2023-08-04 05:28:41) postgres=3D# select ('{&= quot;a":2, "b":"nazdar"}'::jsonb)['a']= ::int;
=E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=90
=E2=94=82 int4 =E2=94=82
=E2=95=9E=E2=95=90=E2=95=90=E2=95= =90=E2=95=90=E2=95=90=E2=95=90=E2=95=A1
=E2=94=82 =C2=A0 =C2=A02 =E2=94= =82
=E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=98
(1 row)

when the parser iterates = over the expression, it crosses ::type node first, so you have information = about the target type. Currently this information is used when the parser i= s going back and when the source type is the same as the target type, the c= ast can be ignored. Probably it needs to add some flag to the operator if t= hey are able to use this. Maybe it can be a new third argument with an expe= cted type. So new kinds of op functions can look like opfx("any",= "any", anyelement) returns anyelement. Maybe you find another po= ssibility. It can be invisible for me (or for you) now.

<= /div>
It is much more work, but the benefits will be generic. I think t= his is an important part for container types, so partial fix is not good, a= nd it requires a system solution. The performance is important, but without= generic solutions, the complexity increases, and this is a much bigger pro= blem.

Regards

Pavel





--
Best Regards
Andy Fan
--000000000000efcc61060210db24--