Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w8WUx-000cW8-0M for pgsql-hackers@arkaria.postgresql.org; Fri, 03 Apr 2026 04:51:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w8WUv-009kOv-2T for pgsql-hackers@arkaria.postgresql.org; Fri, 03 Apr 2026 04:51:10 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w8WUv-009kOm-17 for pgsql-hackers@lists.postgresql.org; Fri, 03 Apr 2026 04:51:09 +0000 Received: from mail-yx1-xb133.google.com ([2607:f8b0:4864:20::b133]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w8WUq-00000000IeM-2ftU for pgsql-hackers@lists.postgresql.org; Fri, 03 Apr 2026 04:51:08 +0000 Received: by mail-yx1-xb133.google.com with SMTP id 956f58d0204a3-64ef161129bso1308395d50.1 for ; Thu, 02 Apr 2026 21:51:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1775191864; cv=none; d=google.com; s=arc-20240605; b=NTSTOMyUZKRZV4/ZiPfUVwDKJNuKhkBGmy3TPCabWqPqhL1XIwGzcYgJi63EIsf5I/ K8/+Bg0FA9cWJPCRjEbN0uFI5AZ5KZVsSQaZzuA7J+gY9ABoFXWU3CFve2yIYJkpQ5ev D2bk+xIp5+PJ059CpHDmcZRKq/GnaOFINU32dDB5k80aKWIdNliDMtVXs3DWBKEJMXFU 8ZPoQcwHTeI5xeWRzgnqqrAL4Vd66UkZRPdJmH9ExF1j4fEWJCraXVVJLVHxFAq0zNkK PMXgRoEvmQjwmDuqBFthQqiZLbqE46W6q0XYB8tZU/0iuWUTkFeY9mMwpZFQpyCYNakn 9tBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=HTRj1my1JLCGtWD34ZF85y/htOXVy5Yiujjf7CNhJe8=; fh=uAOLH2HrwfcRp/M3tDvNXZ1bNY1Dxx+grKQGlZsCSdo=; b=Kbrcs0lsB5M8O+UuiXPpzzZUB+m38f1chyeTHhyuD2UsR/vFjxn9XACpPSzL33K2Dz q4JLQPQdgH/TWUHz3cEpMXuJedOKkO5kQJ+k1GetNedWu7/PFyd96rG2BxjWS5A1Rdbw SWHgwgIItwWOfgSbqMOodqRxWVOLupY6g8TxgCvlTFztR4P9HfE4O1Uw8/tQPBrWj7iJ JBvsqZ7z0Wx5QDQGlKPNFv3iM0Uu51I3zh7nTagHo1gbNH8p4a4arl6IdZG6c1gHp80V 35hy2He825Ngld/goiw34hELjvPhiO87S9AJ5nC3GF7QUvZQ1IkfXwlUvotIcYzhgoKu 1K2Q==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1775191864; x=1775796664; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=HTRj1my1JLCGtWD34ZF85y/htOXVy5Yiujjf7CNhJe8=; b=kCHqA9ziqilzWgeup8zLb1w0gi04gzjCuuJXsGAisTk9zT+mgcDcxsj8i8+IDWsF3J Tp/jyg9S5H/YBtQ+Pq1rXt8i5Gqj2IbqD0GXuepnZBHwe8LxPkpvk2Cvux2P9Y0NlmrY 0IJrG0GyOg5+7et5bAq1Y+iKM6oFrmFJwbwSfqsighaA912IRxoK94RUdxVdhPJfrzBO zUo3w5OBfJtCRSDLNanHPw6k7Gi548GybRXS/d0wtHTG6ushs5d1iRG4pJBIsUVk0ZSq I47naCkB3fUIfbZ5jtnAJuzVX5TPaq0hdCbDcYX4aKtCcydAvm7QDiqjDJKlRVwOz8dT lSfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775191864; x=1775796664; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=HTRj1my1JLCGtWD34ZF85y/htOXVy5Yiujjf7CNhJe8=; b=FfuHf/T2mxJ4lKwnBnv7OfiqpdQSYLv9y3UjIN0I3JoqRbji2wh7bYo/bIh4nhjzCt Nf3shOR+HkD0KE4IKcRnREmivDrt64winKz86chn0SqjdRc0WVOHBRKQsDHfMhhKdNhD 2Ir/IKBk+Xv5wzRhsoeIRoCUk+I00p+HJOE8zcVYe08abAjvpXK1YrkNod6sEUFVKQv4 0NJjk8ASk42nH4uLHjT4Y2fHg7c+WZdMbWns7aHZVtfkOCD7TTnuGNYOGsjj/YDsMV2G DzgbZPX8HdCpYVByUTIQgfzrBNFzEnOHn1QOb1U8kd+3iPBhmRj/YATMKgTl24XlKIY8 a67A== X-Forwarded-Encrypted: i=1; AJvYcCWklpBGwn/djebwYTncSypPqrEq8OwYg1KS+s1744VOlmTMqskQ2dy92di+2k4lN2CfQ34A5Dv4uQOnJXmx@lists.postgresql.org X-Gm-Message-State: AOJu0Yw+iOVqRcwDp/mcLYk6/GcknvcJho1SUGed1DwzX2dbGJwnwLfi cQFJjl3z1qpeJHvnDDxIrsqYHR6az0+r42eqiiIpjAttlKRm9J/0WF5L0b1QhG2Y3J8jaiQEmm7 nB3GL1IXIIBFXi8o0deOv1Zdk1HvRMSQ= X-Gm-Gg: AeBDietetjmyyqNG0RYe1CpFbsqx8pRa30iGASahxBnwYcMdHxU91IDeEgO4GnEC1mT 74ZEH1HshBY8QUxoWsHzmpWdXivSWofsUl30lRXsdMYIrLDfKiuTrI3/OMEGzsEU1nF9K9Mn2da cbQ4g8vujF035Z3SNWzTLTbEiVRZc5s2Ro2hoP2VKUwMYa6CUE5eFDAM3v+kvcHxzaKygpBqKX8 gN6vm+33FVEKilacoIAPSXQKkwvlkC0vhTXJ972BJmVg5zHxNaBulSKJeoVQeSoUuL+pBJycfz8 FzJghRrIZwvnXGh8yEUKikEx1Hq6Y/KVEcWj1cvUl0DEWHiGn9AvvFuEh1opLI0AzjFN1KNlEbu V36BgwuXFB4G2cHe3Z7sAOCdXu6e0BXw2+KZBC/oD/3xNZtg7ywbHxyMq3A== X-Received: by 2002:a05:690e:140b:b0:64a:e589:ecb2 with SMTP id 956f58d0204a3-65048877510mr1439859d50.62.1775191863743; Thu, 02 Apr 2026 21:51:03 -0700 (PDT) MIME-Version: 1.0 References: <87r0hmvuvr.fsf@163.com> <8102ff5b-b156-409e-a48f-e53e63a39b36@eisentraut.org> <8734t6c5rh.fsf@163.com> <87o7bn7z56.fsf@163.com> <875xx197bp.fsf@163.com> <87ttk0lgcx.fsf@163.com> <8734m5fua1.fsf@163.com> <87a5dx4cfb.fsf@163.com> <7487A577-7BDC-4B94-9FFA-741E95439BFC@gmail.com> In-Reply-To: <7487A577-7BDC-4B94-9FFA-741E95439BFC@gmail.com> From: Pavel Stehule Date: Fri, 3 Apr 2026 06:50:27 +0200 X-Gm-Features: AQROBzDW75Bg5I39kmTqkKUZEfTjIZZfvYVcLQ9IBt43cTEpvv-ijYr84bnLBho Message-ID: Subject: Re: Extract numeric filed in JSONB more effectively To: Haibo Yan Cc: Dmitry Dolgov <9erthalion6@gmail.com>, Andy Fan , David Rowley , Peter Eisentraut , Amit Langote , Alvaro Herrera , jian he , Chapman Flack , pgsql-hackers@lists.postgresql.org Content-Type: multipart/alternative; boundary="00000000000072571c064e871074" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000072571c064e871074 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable p=C3=A1 3. 4. 2026 v 6:46 odes=C3=ADlatel Haibo Yan = napsal: > On Nov 22, 2024, at 10:14=E2=80=AFAM, Dmitry Dolgov <9erthalion6@gmail.co= m> wrote: > > > >> On Mon, Nov 18, 2024 at 08:23:52AM GMT, Andy Fan wrote: > >> > >>>>> I imagined you'd the patch should create a SupportRequestSimplify > >>>>> support function for jsonb_numeric() that checks if the input > >>>>> expression is an OpExpr with funcid of jsonb_object_field(). All y= ou > >>>>> do then is ditch the cast and change the OpExpr to call a new > function > >>>>> named jsonb_object_field_numeric() which returns the val.numeric > >>>>> directly. Likely the same support function could handle jsonb cast= s > >>>>> to other types too, in which case you'd just call some other > function, > >>>>> e.g jsonb_object_field_timestamp() or jsonb_object_field_boolean(). > >>>> > >>>> Basically yes. The reason complexity comes when we many operators we > >>>> want to optimize AND my patch I want to reduce the number of functio= n > >>>> created. > >>>> > >>>> [...] > >>>> > >>>> Within the start / finish function, we need to create *7* functions. > >>> > >>> Any particular reason you want to keep number of functions minimal? I= s > >>> it just to make the patch smaller? I might be missing something witho= ut > >>> looking at the implementation in details, but the difference between = 10 > >>> and 7 functions doesn't seem to be significant. > >> > >> Another reason is for reducing code duplication, writting too many > >> similar function looks not good to me. Chapman expressed this idea > >> first at [1]. Search "it would make me happy to further reduce some > >> of the code" in the message. > >> > >> Acutally this doesn't make the patch complexer too much. > >> > >> [1] > >> > https://www.postgresql.org/message-id/5138c6b5fd239e7ce4e1a4e63826ac27%40= anastigmatix.net > > > > It might not make everything too much complex, but e.g. relabeling of > > the first argument for a "finish" function into an internal one sounds > > strange to me. Maybe there is a way to avoid duplication of the code, > > but keep all needed functions in pg_proc? > > > > Btw, sorry to complain about small details, but I find start / finish > > naming pattern not quite fitting here. Their main purpose is to extract > > / convert a value, the order in which they are happening is less > > relevant. > > > > > > > > > > Hi all, > I=E2=80=99d like to continue pushing this patch forward. > Based on the earlier discussion, I reworked the patch into a smaller > stage-1 version with a narrower scope and a simpler rewrite strategy. The > current patch keeps the normal SQL syntax unchanged and uses > support-function simplification to rewrite only the following patterns: > (jsonb_object_field(...))::numeric > jsonb_object_field(...))::bool > into explicit typed extractor calls. > So at this stage it intentionally covers only: > jsonb_object_field(...) / -> > casts to numeric > casts to bool > and does not yet try to cover array/path extraction or integer/float type= d > extractors. > I also ran a small microbenchmark to isolate the cast-over-object-field > path. On my setup, the current patch shows the following gains: > Query Before After > Speedup > > -------------------------------------------------------------------------= ------- > SELECT sum((j->'n')::numeric) FROM t 118.028 ms 56.082 ms 2.10= x > SELECT count(*) FROM t WHERE (j->'b')::bool 115.665 ms 51.945 ms 2.23= x > > -------------------------------------------------------------------------= ------- > These are microbenchmark numbers rather than end-to-end workload results, > but they suggest that the simplified rewrite path is worth pursuing. > My goal with this version is not to solve the full matrix at once, but to > first land a reviewer-friendly subset that: > 1. does not introduce new user-visible operators, > 2. keeps ordinary cast syntax unchanged, > 3. avoids the more abstract internal/start-finish style machinery, > 4. and uses explicit rewrite targets that are easier to review. > If this direction looks reasonable, I=E2=80=99d appreciate another round = of review > on the updated patch. If people think the stage-1 scope is acceptable, I > can continue with follow-up patches for additional extractor families and > target types. > Thanks, > Haibo > +1 Pavel --00000000000072571c064e871074 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


p=C3=A1 3. 4. 2= 026 v=C2=A06:46 odes=C3=ADlatel Haibo Yan <tristan.yim@gmail.com> napsal:
On Nov 22, 2024, at 10:14=E2=80=AFAM, Dmi= try Dolgov <9= erthalion6@gmail.com> wrote:
>
>> On Mon, Nov 18, 2024 at 08:23:52AM GMT, Andy Fan wrote:
>>
>>>>> I imagined you'd the patch should create a Support= RequestSimplify
>>>>> support function for jsonb_numeric() that checks if th= e input
>>>>> expression is an OpExpr with funcid of jsonb_object_fi= eld().=C2=A0 All you
>>>>> do then is ditch the cast and change the OpExpr to cal= l a new function
>>>>> named jsonb_object_field_numeric() which returns the v= al.numeric
>>>>> directly.=C2=A0 Likely the same support function could= handle jsonb casts
>>>>> to other types too, in which case you'd just call = some other function,
>>>>> e.g jsonb_object_field_timestamp() or jsonb_object_fie= ld_boolean().
>>>>
>>>> Basically yes. The reason complexity comes when we many op= erators we
>>>> want to optimize AND my patch I want to reduce the number = of function
>>>> created.
>>>>
>>>> [...]
>>>>
>>>> Within the start / finish function, we need to create *7* = functions.
>>>
>>> Any particular reason you want to keep number of functions min= imal? Is
>>> it just to make the patch smaller? I might be missing somethin= g without
>>> looking at the implementation in details, but the difference b= etween 10
>>> and 7 functions doesn't seem to be significant.
>>
>> Another reason is for reducing code duplication, writting too many=
>> similar function looks not good to me. Chapman expressed this idea=
>> first at [1]. Search "it would make me happy to further reduc= e some
>> of the code" in the message.
>>
>> Acutally this doesn't make the patch complexer too much.
>>
>> [1]
>> ht= tps://www.postgresql.org/message-id/5138c6b5fd239e7ce4e1a4e63826ac27%40anas= tigmatix.net
>
> It might not make everything too much complex, but e.g. relabeling of<= br> > the first argument for a "finish" function into an internal = one sounds
> strange to me. Maybe there is a way to avoid duplication of the code,<= br> > but keep all needed functions in pg_proc?
>
> Btw, sorry to complain about small details, but I find start / finish<= br> > naming pattern not quite fitting here. Their main purpose is to extrac= t
> / convert a value, the order in which they are happening is less
> relevant.
>
>
>
>

Hi all,
I=E2=80=99d like to continue pushing this patch forward.
Based on the earlier discussion, I reworked the patch into a smaller stage-= 1 version with a narrower scope and a simpler rewrite strategy. The current= patch keeps the normal SQL syntax unchanged and uses support-function simp= lification to rewrite only the following patterns:
=C2=A0 =C2=A0 =C2=A0 =C2=A0 (jsonb_object_field(...))::numeric
=C2=A0 =C2=A0 =C2=A0 =C2=A0 jsonb_object_field(...))::bool
into explicit typed extractor calls.
So at this stage it intentionally covers only:
=C2=A0 =C2=A0 =C2=A0 =C2=A0 jsonb_object_field(...) / ->
=C2=A0 =C2=A0 =C2=A0 =C2=A0 casts to numeric
=C2=A0 =C2=A0 =C2=A0 =C2=A0 casts to bool
and does not yet try to cover array/path extraction or integer/float typed = extractors.
I also ran a small microbenchmark to isolate the cast-over-object-field pat= h. On my setup, the current patch shows the following gains:
Query=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Befor= e=C2=A0 =C2=A0 =C2=A0 After=C2=A0 =C2=A0 =C2=A0 =C2=A0Speedup
---------------------------------------------------------------------------= -----
SELECT sum((j->'n')::numeric) FROM t=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0118.028 ms=C2=A0 56.082 ms=C2=A0 =C2=A02.10x
SELECT count(*) FROM t WHERE (j->'b')::bool=C2=A0 115.665 ms=C2= =A0 51.945 ms=C2=A0 =C2=A02.23x
---------------------------------------------------------------------------= -----
These are microbenchmark numbers rather than end-to-end workload results, b= ut they suggest that the simplified rewrite path is worth pursuing.
My goal with this version is not to solve the full matrix at once, but to f= irst land a reviewer-friendly subset that:
1. does not introduce new user-visible operators,
2. keeps ordinary cast syntax unchanged,
3. avoids the more abstract internal/start-finish style machinery,
4. and uses explicit rewrite targets that are easier to review.
If this direction looks reasonable, I=E2=80=99d appreciate another round of= review on the updated patch. If people think the stage-1 scope is acceptab= le, I can continue with follow-up patches for additional extractor families= and target types.
Thanks,
Haibo

+1

Pavel=
=C2=A0
--00000000000072571c064e871074--