Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qRYIC-0029j9-Pf for pgsql-hackers@arkaria.postgresql.org; Thu, 03 Aug 2023 13:23:05 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1qRYIB-00DYQ1-5A for pgsql-hackers@arkaria.postgresql.org; Thu, 03 Aug 2023 13:23:03 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1qRYIA-00DYPt-O4 for pgsql-hackers@lists.postgresql.org; Thu, 03 Aug 2023 13:23:03 +0000 Received: from mail-oi1-x230.google.com ([2607:f8b0:4864:20::230]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qRYI8-000JuR-Fp for pgsql-hackers@postgresql.org; Thu, 03 Aug 2023 13:23:01 +0000 Received: by mail-oi1-x230.google.com with SMTP id 5614622812f47-3a36b52b4a4so659976b6e.1 for ; Thu, 03 Aug 2023 06:23:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1691068979; x=1691673779; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=bXhX+YGumHP8S0vGSBigKUJ+SeMYxiMfMEgt0VkdIa0=; b=qkbKF7MmwNvlD7TnY4ER7zxCpaEgxktBckgMcsT8gIQyrBNpfSQBQxPk0XI2iLMZD/ Ou93WN4AprESfJ4sxEqzcuHQr42zFhNQLUg9qXRmj3NgBhGnhO520kNRoPMqe81QmL+c zadHzNmhzhHw5eCPYuaCal/KkenvwSqOkckE3vj4Xj9N9X79THOT16gio14vsJyIdzJu 3UADgogX4kOa/YiihK/48eGWLMASLxQbfOK2HhF7S4aQcUXs7NLcW86AlvpalivZJGfh Py0GPf48GZIMjdRILOpIkWmonB5YJZR6egh2FFeUe3MvzC5WMUfcfRstgk0JnF4rlPwu OjCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691068979; x=1691673779; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=bXhX+YGumHP8S0vGSBigKUJ+SeMYxiMfMEgt0VkdIa0=; b=bDUpefSkzcq4JiHfp78yDW1A22s4IJFww5WpHy+aIcNsuTTREuwnO+6ZCAGLHb6HCB 26tXnIEABYZq9ffcafCKG3i5DCqA6XMTqmtw1XZ9nBUZIFrzXbDwojPYK3ByqVGxwoFx EalTbykQyY4s5rviBn6JLPPK8DWYfDF3E5bVW9x9Sz3YlcYPrDOPo35TdxU4IfwrHbtc kTS7HKQojnH0EmjlSJW0ScASzMaUULgaFhifjf5i5j/7JM6Xjmw+PTB/FNpURIgSGpTO bDsbxIPtB5WoqrEPP6OxgG/2t9dS6M5guRSQQ1hYOrTtbEtydq2Oro8sBuPbt+lO8kVJ qkiQ== X-Gm-Message-State: ABy/qLYeEtopODgYlHMrzIFerw0hXzc0NJeqmATxoEDZ8T9QVJCvLJIW O2bJ8ivDh0DQa2h1UfD8bOHmDXy/zFJ1bU8xeSIR+Y/9dko= X-Google-Smtp-Source: APBJJlFEXUNKekWPeG+dR6N5s86WLNhWHMlv+KTEbg9cruw1fnHZYip9/aDloVrRpkJSTtGrYndJQWFI0boC1uvbZFA= X-Received: by 2002:a05:6808:2194:b0:3a1:bced:9e83 with SMTP id be20-20020a056808219400b003a1bced9e83mr17603381oib.5.1691068979545; Thu, 03 Aug 2023 06:22:59 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Andy Fan Date: Thu, 3 Aug 2023 21:22:48 +0800 Message-ID: Subject: Re: Extract numeric filed in JSONB more effectively To: Pavel Stehule Cc: jian he , pgsql-hackers Content-Type: multipart/alternative; boundary="000000000000d10984060204ade8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000d10984060204ade8 Content-Type: text/plain; charset="UTF-8" Hi: > More, I believe so lot of people uses more common syntax, and then this > syntax should to have good performance - for jsonb - (val->'op')::numeric > works, and then there should not be performance penalty, because this > syntax will be used in 99%. > This looks like a valid opinion IMO, but to rescue it, we have to do something like "internal structure" and remove the existing cast. But even we pay the effort, it still breaks some common knowledge, since xx:numeric is not a cast. It is an "internal structure"! I don't think "Black magic" is a proper word here, since it is not much >> different from ->> return a text. If you argue text can be cast to >> most-of-types, that would be a reason, but I doubt this difference >> should generate a "black magic". >> > > I used the term black magic, because nobody without reading documentation > can find this operator. > I think this is what document is used for.. > It is used just for this special case, and the functionality is the same > as using cast (only with different performance). > This is not good, but I didn't see a better choice so far, see my first graph. > > The operator ->> is more widely used. But if we have some possibility to > work without it, then the usage for a lot of users will be more simple. > More if the target types can be based on context > It would be cool but still I didn't see a way to do that without making something else complex. >>> Maybe we can introduce some *internal operator* "extract to type", and >>> in rewrite stage we can the pattern (x->'field')::type transform to OP(x, >>> 'field', typid) >>> >> >> Not sure what the OP should be? If it is a function, what is the >> return value? It looks to me like it is hard to do in c language? >> > > It should be internal structure - it can be similar like COALESCE or IS > operator > It may work, but see my answer in the first graph. > > >> >> After all, if we really care about the number of operators, I'm OK >> with just let users use the function directly, like >> >> jsonb_field_as_numeric(jsonb, 'filedname') >> jsonb_field_as_timestamp(jsonb, 'filedname'); >> jsonb_field_as_timestamptz(jsonb, 'filedname'); >> jsonb_field_as_date(jsonb, 'filedname'); >> >> it can save an operator and sloves the readable issue. >> > > I don't like it too much, but it is better than introduction new operator > Good to know it. Naming operators is a complex task if we add four. > We already have the jsonb_extract_path and jsonb_extract_path_text > function. > I can't follow this. jsonb_extract_path returns a jsonb, which is far away from our goal: return a numeric effectively? I can imagine to usage "anyelement" type too. some like > `jsonb_extract_path_type(jsonb, anyelement, variadic text[] )` > Can you elaborate this please? -- Best Regards Andy Fan --000000000000d10984060204ade8 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi:
=C2=A0
=
More, I believe so lot = of people uses more common syntax, and then this syntax should to have good= performance - for jsonb - (val->'op')::numeric works, and then = there should not be performance penalty, because this syntax will be used i= n 99%.

This looks like a valid = opinion IMO,=C2=A0 but to rescue=C2=A0it, we have to do
something= like "internal structure" and remove the existing cast.=C2=A0
But even we pay the effort, it still breaks some common knowledge,<= /div>
since xx:numeric is not a cast.=C2=A0 It is an "internal str= ucture"!=C2=A0=C2=A0


I don't think "Black magic" is a= proper word here, since it is not much
different from ->> = return a text.=C2=A0 If you argue text can be cast to=C2=A0
most-= of-types,=C2=A0 that would be a reason, but I doubt this difference
should generate a "black magic".=C2=A0

I used the term black magic, because nobody with= out reading documentation can find this operator.

I think this is what document is used for..=C2=A0= =C2=A0
=C2=A0
It is used just for t= his special case, and the functionality is the same as using cast (only wit= h different performance).

This is not good, but I didn't see a better choice so far,=C2=A0 see m= y first graph.
=C2=A0

The operator ->> is more widely used. But if we have some possibilit= y to work without it, then the usage for a lot of users will be more simple= . More if the target types can be based on context

It would be cool but still I didn't see a= way to do that without making
something else complex.=C2=A0

<= div>
Maybe we can introduce some *internal operator* "ex= tract to type", and in rewrite stage we can the pattern (x->'fi= eld')::type transform to OP(x, 'field', typid)

Not sure what the OP should be?=C2=A0 If = it is a function, what is the
return value?=C2=A0 It looks to me = like it is hard to do in c language?
It should be internal structure - it can be similar like COALES= CE or IS operator

It may = work, but see my answer in the first graph.=C2=A0
=C2=A0
=C2=A0

After all,=C2=A0 if we really care about the number of operators, I'm= OK
with just let users use the function directly, like

jsonb_field_as_numeric(jsonb, 'filedname')=C2=A0
jsonb_field_as_timestamp(jsonb, 'filedname');=C2=A0
jsonb_field_as_timestamptz(jsonb, 'filedname');=C2=A0
<= div>jsonb_field_as_date(jsonb, 'filedname');=C2=A0
it can save an operator and sloves the readable issue.=C2=A0

I don't like it too mu= ch, but it is better than introduction new operator=C2=A0
=

Good to know it.=C2=A0 Naming operators is= a complex task=C2=A0 if we add four.=C2=A0


We already have the jsonb_extract_path and json= b_extract_path_text function.

=
I can't follow this.=C2=A0 jsonb_extract_path returns a jsonb, whi= ch is=C2=A0=C2=A0far away from
our goal: return a numeric effecti= vely?

I can imagine to usage &q= uot;anyelement" type too. some like `jsonb_extract_path_type(jsonb, an= yelement, variadic text[] )`
=C2=A0
Can you=C2=A0elaborate this please?=C2=A0 =C2=A0

=
--
Best Regards
Andy Fan
=
--000000000000d10984060204ade8--