Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qRT9v-001r7i-5k for pgsql-hackers@arkaria.postgresql.org; Thu, 03 Aug 2023 07:54:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qRT9s-00B7En-VK for pgsql-hackers@arkaria.postgresql.org; Thu, 03 Aug 2023 07:54:09 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qRT9s-00B7Ef-JJ for pgsql-hackers@lists.postgresql.org; Thu, 03 Aug 2023 07:54:09 +0000 Received: from mail-oa1-x2f.google.com ([2001:4860:4864:20::2f]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qRT9l-000IB4-DK for pgsql-hackers@postgresql.org; Thu, 03 Aug 2023 07:54:07 +0000 Received: by mail-oa1-x2f.google.com with SMTP id 586e51a60fabf-1bb5c259b44so464234fac.1 for ; Thu, 03 Aug 2023 00:54:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1691049239; x=1691654039; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=d+tN+n+FWbfjMKQ/eXOcXS2UbpR8mLZGASbXuma5e5s=; b=Lhrj/wzrbV7SOppmJtYKmXurxX/xnrk8bqDgbqCD6eIhjmHu+vgnoeX3uudpgx2xcZ E0CPceBObP9nHB/pY1KSOF77qRWh3rcN+o0LmBQJvVMQ0oC3RZ5+8VmmLVVsFmEmR09B fIKUxYkHkqulq634PgYJ5XCtf3JXO3jQnq7vgKqYl2O59/gqWm0GI+1Qjw/AjMgu0gSK vGRjlZqnv1VRjwWp6clYagItsqIxhaZIG4jnz0Dq/U0FRcgimIWkj8v0JgEPyZD4jMju dC+v9daSRqNyWQ5PeZTBgobJMBxsWRx5hnAlYTgkMuK7ft0cQwpMlMSsYX/DaUnhc58Z ju8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691049239; x=1691654039; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=d+tN+n+FWbfjMKQ/eXOcXS2UbpR8mLZGASbXuma5e5s=; b=GxfV1EcPWX2W5iYa1P2IPZeR1FWJLHci13WvzYp9itPT5fhZdL5eMjS2FgTBvj+dpY 53671gNxerBHD+SAiPSrxfgZQOge0cXWF6IdoviwE2Hcpfgj6iSednJD3cH37D0iCxar jp1dWrjdosLd42fnhFPrOWJb6SyHZujArAmiF3jv04n5gheGbylhlnMEZRQh8VN5Jj3j DWvrKxMy2xvQ6xHQztyxRH9/r6apItEOJhy/H2YRrdFMrI/k5ajjG3NLdUzbz/pEKSzQ sRkureQg2x3NKLxZ+wasYmEbmkctI1Dhiv2pb9Jk36Nb6DGFp9qg1eZkD22h/HkdmuCG xp1A== X-Gm-Message-State: ABy/qLaRDHO8E5hipAjLq5I54xJ2xr4+r7AWsEPn7/UaTuF/t2FeCFfK CngtPIoa90ULXFMDDGtgIu+AreGtO/jFqFCB1WY= X-Google-Smtp-Source: APBJJlHaprAfyv9uWcgOkdmKopmupOEkW4iOgUW9MQscohwMhF0TgKojYVjIKQ1S671XrU7TYT2xNTH5TrCaL4dafiI= X-Received: by 2002:a05:6871:5206:b0:1bb:70be:5061 with SMTP id ht6-20020a056871520600b001bb70be5061mr18591384oac.5.1691049239042; Thu, 03 Aug 2023 00:53:59 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Andy Fan Date: Thu, 3 Aug 2023 15:53:47 +0800 Message-ID: Subject: Re: Extract numeric filed in JSONB more effectively To: Pavel Stehule Cc: jian he , pgsql-hackers Content-Type: multipart/alternative; boundary="00000000000030deff06020015ae" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000030deff06020015ae Content-Type: text/plain; charset="UTF-8" Hi Pavel: Thanks for the feedback. I don't like this solution because it is bloating operators and it is not > extra readable. > If we support it with cast, could we say we are bloating CAST? It is true that it is not extra readable, if so how about a->>'a' return text? Actually I can't guess any meaning of the existing jsonb operations without documentation. For completeness you should implement cast for date, int, boolean too. > Next, the same problem is with XML or hstore type (probably with any types > that are containers). > I am not sure completeness is a gold rule we should obey anytime, like we have some function like int24le to avoid the unnecessary cast, but we just avoid casting for special types for performance reason, but not for all. At the same time, `int2/int4/int8` doesn't have a binary compatibility type in jsonb. and the serialization /deserialization for boolean is pretty cheap. I didn't realize timetime types are binary compatible with SQL, so maybe we can have some similar optimization as well. (It is a pity that timestamp(tz) are not binary, or else we may just need one operator). > > I don't like the idea so using a special operator is 2x faster than common > syntax for casting. It is a signal, so there is a space for optimization. > Black magic with special operators is not user friendly for relatively > common problems. > I don't think "Black magic" is a proper word here, since it is not much different from ->> return a text. If you argue text can be cast to most-of-types, that would be a reason, but I doubt this difference should generate a "black magic". > > Maybe we can introduce some *internal operator* "extract to type", and in > rewrite stage we can the pattern (x->'field')::type transform to OP(x, > 'field', typid) > Not sure what the OP should be? If it is a function, what is the return value? It looks to me like it is hard to do in c language? After all, if we really care about the number of operators, I'm OK with just let users use the function directly, like jsonb_field_as_numeric(jsonb, 'filedname') jsonb_field_as_timestamp(jsonb, 'filedname'); jsonb_field_as_timestamptz(jsonb, 'filedname'); jsonb_field_as_date(jsonb, 'filedname'); it can save an operator and sloves the readable issue. -- Best Regards Andy Fan --00000000000030deff06020015ae Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Pavel:

Thanks for the fee= dback.=C2=A0

I don't like this solution because it is bloating=C2=A0 operat= ors and it is not extra readable.
=C2= =A0
If we support it with cast, could we say we are bloating= =C2=A0CAST?=C2=A0 It is true
that it is not extra readable, if so= how about=C2=A0 a->>'a'=C2=A0 return text?=C2=A0 Actually
I can't guess any meaning of the existing jsonb operation= s without
documentation.

<= div>For completeness you should implement cast for date, int, boolean too. = Next, the same problem is with XML or hstore type (probably with any types = that are containers).

I a= m not sure completeness is a gold rule we should obey anytime,
li= ke we have some function like int24le to avoid the unnecessary
ca= st, but we just avoid casting for special types for performance
r= eason, but not for all. At the same time,=C2=A0 `int2/int4/int8` doesn'= t
have a binary compatibility type in jsonb. and the serializatio= n
/deserialization for boolean is pretty cheap.

I didn't realize timetime=C2=A0types are binary compatible with= SQL,=C2=A0=C2=A0
so maybe we can have some similar optimization = as well.=C2=A0
(It is a pity that timestamp(tz) are not binary, o= r else we may=C2=A0
just need one operator).=C2=A0
=C2= =A0

I don't like the idea s= o using a special operator is 2x faster than common syntax for casting. It = is a signal, so there is a space for optimization. Black magic with special= operators is not user friendly for relatively common problems.

I don't think "Black ma= gic" is a proper word here, since it is not much
different f= rom ->> return a text.=C2=A0 If you argue text can be cast to=C2=A0
most-of-types,=C2=A0 that would be a reason, but I doubt this diff= erence
should generate a "black magic".=C2=A0
=C2=A0

Maybe we can= introduce some *internal operator* "extract to type", and in rew= rite stage we can the pattern (x->'field')::type transform to OP= (x, 'field', typid)

Not sure what the OP should be?=C2=A0 If it is a function, what is the
return value?=C2=A0 It looks to me like it is hard to do in c lang= uage?

After all,=C2=A0 if we really care about the= number of operators, I'm OK
with just let users use the func= tion directly, like

jsonb_field_as_numeric(jsonb, = 'filedname')=C2=A0
jsonb_field_as_timestamp(jsonb, 'f= iledname');=C2=A0
jsonb_field_as_timestamptz(jsonb, 'file= dname');=C2=A0
jsonb_field_as_date(jsonb, 'filedname&= #39;);=C2=A0

it can save an operator and slove= s the readable issue.=C2=A0

--
=
Best Regards
Andy Fan
--00000000000030deff06020015ae--