Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wA1qQ-001zxW-1x for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 08:31:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wA1qN-00FoXa-3D for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 08:31:32 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wA1qN-00FoXR-2B for pgsql-hackers@lists.postgresql.org; Tue, 07 Apr 2026 08:31:32 +0000 Received: from mail-yx1-xb12e.google.com ([2607:f8b0:4864:20::b12e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wA1qL-000000013t0-1GmO for pgsql-hackers@lists.postgresql.org; Tue, 07 Apr 2026 08:31:31 +0000 Received: by mail-yx1-xb12e.google.com with SMTP id 956f58d0204a3-64eee7b83cfso3733603d50.3 for ; Tue, 07 Apr 2026 01:31:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1775550688; cv=none; d=google.com; s=arc-20240605; b=VVp1JPA4Qzwy5SAC5ifw7hmzbkbWZf+0/tZ54D3f1DAJTSuE5CK8nJIQYQFlVqR7Eq O+GfBuK5nnaNyEv1BE7OUW22hLJLZuQo0nTATdCVg2WcEoOFW2IbT+gLIqUBmEpoIG7d dThqpr0NyD9AxnNoSl9wxEiGWGxXuu91qJJJqXy9ud8pnVd7Qrgrgp6RJ35T6EnCbPT5 2TbfH18TtkXFMkEYnEn0O8/1b5tAVG008AVgbkdAH53djxLubp3gSVXB+RU0jMNa19s1 q9ouegD4Xw1KCiTUtNnLEuj5qPMOhzj9JP3lAVZsmJ6ngoSWvYDk8F43l4V/wx9o21e/ n7rA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=hy8BYIB71bNTzmrGuJjW6Nqa26SdXaqKY2PreSRb/1A=; fh=pd41jyUcyq28QW/0EVCsFSPI9JOSkUVYgivTG1+D8yQ=; b=I6JviV+g34efQXO9RE+rz0xVf2XIZvARmofnUUF6/bFKyw8Dx4Y1s3V8PIsxYL0l+k aFjmRyRURtWrzSxzOz0f6utZnQDZ7HRLIVkX43MZnsG8QkivCOXfBUgFrHLioGm/WQ77 kktNF8zwMPyBzUZHH4M1uMFarFjQPU23HX0KrIORSC+So8Xyrs1yyVgIxMb+6262fi0u 9NxSnrQzaXminjZ4PNMI5KiwiOUf+cEvT44CI2ZYWFtEeHZBbHHsD2v1Lsyz9ArkLULu tuA35dF8zSNRba5RqO0ykDE1nIfUsR8AXyD3/fHPyWckIEqHEFI6Xkcmthg2mnfygJHF v37g==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1775550688; x=1776155488; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=hy8BYIB71bNTzmrGuJjW6Nqa26SdXaqKY2PreSRb/1A=; b=GiLOijFL3LGYxk/wOC0EFRMDcEW5QEUbU9ZTGr5FP613lXA/3yq1CIdXdeL8ACFA8H zbZ4EbpgmY0/AuEMmcKRqJhq0oyLsvRTQpoNd4O2J6wTF7RBwZwc4SA9h3Sjr+Whfb65 03s1mu2FjSUREL7DMMTGnrGvQQ4tlu9Uh4L925+d4tX9+FBaKU0dUBWDc5pptH7Hamxk 474T+pTBTDNi+ekx1NOpd2IvNzISa/cpA7dYfajDulQ9bjwEocA2wzDXxfTCKIhRq41A iQzLjsVuyS4m+lylutCI5gSvljK3djvgz5XLACVHqJSHH2MVrbXgZEGAVl5Q5caHqm0g zGqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775550688; x=1776155488; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=hy8BYIB71bNTzmrGuJjW6Nqa26SdXaqKY2PreSRb/1A=; b=f+QRVetvDnnFtn0dbgP/jhGkx8MC19neli4qhb4VdN4r/O45voSUlNSQED5hq1JclK 3b91n3Kb8h86o/asD5tjJvSyI0wZz2Rx/9sk08dv7ypIoWYmC3+TWbGcxGKayaryxS9C 7Klj2eNE0TbSgPaIvG3/QM18vlaIX9mOd6H3Iec3d8+l1NZMrzgHQ4yMe4JyCdoY0xtI bSH60zdbXKndgg3ssVROO8nRNOBMKL8iLrymY+XC9pmcb5HK9PjAzCV5N83SuQLAbxuT +okZrSoCmO7pcMVOLe7TtDM99P9AEUCqiR3Zdk7YmvFDfd28BshGwLeYR/NfODCIIpKo L0KA== X-Forwarded-Encrypted: i=1; AJvYcCUFMVhJjWYl9ccVLnHxDrqbeCleDAnUUD27DtEQBLBsf9/CvZ5k4pwpKTYhPCADQgc/OK47YeYt/TbskBSQ@lists.postgresql.org X-Gm-Message-State: AOJu0YzUJD4mF+5zSNdYDSC4wk9kA6nI8rcoHaLTQbHSTGcmTJ61cZ+c TzaU62EVPQICmvAZv1gBX3yxzCvdzRQ7/qYeC2tt+L/E85B7MWCIzwGtZnjlsghxLeGKknHh2f+ rISH/8uaDokYMQKnC6sycjbrvWo5SgDU= X-Gm-Gg: AeBDievqqDAXfCcRnXX55v8s1y4ZVtOOrT4YopOW4KhkNnaH7kgXkwHqTh0Amd3ZBRu Cc2V4ZXV/v/5eI+dxE5oYAdOqHpVXMVVR6+nrESDoqWebPcY469S/E7XETvsLuD2UayT1c51iun UZSxkPxJSWMFJskVaKlIGK0vrcawpqVpYrgUG8Eb+KPdkBy08Hiiim4nZw9YkikOKZkXRuy+Jwk TtxBwhfVsfbT4oxR0FM8XI8U3yom6jrxL/MEgSuOw+FcrywAQSZPjTXVKrjFNRYpl5QzZzUchk9 YvTgs17wh5PR/oPr9e/1UY4WqF2mkttTZVSMbidcXc7bgXpKOlE= X-Received: by 2002:a53:bb48:0:b0:650:3431:24c7 with SMTP id 956f58d0204a3-65048825349mr10983405d50.34.1775550687767; Tue, 07 Apr 2026 01:31:27 -0700 (PDT) MIME-Version: 1.0 References: <87r0hmvuvr.fsf@163.com> <8102ff5b-b156-409e-a48f-e53e63a39b36@eisentraut.org> <8734t6c5rh.fsf@163.com> <87o7bn7z56.fsf@163.com> <875xx197bp.fsf@163.com> <87ttk0lgcx.fsf@163.com> <8734m5fua1.fsf@163.com> <87a5dx4cfb.fsf@163.com> <7487A577-7BDC-4B94-9FFA-741E95439BFC@gmail.com> <873417fgc0.fsf@163.com> In-Reply-To: <873417fgc0.fsf@163.com> From: Haibo Yan Date: Tue, 7 Apr 2026 01:31:16 -0700 X-Gm-Features: AQROBzAPK8U2uyKzc1dkJLt1lCFwLc3zw15nekf1vlhpt3F8EIq8WCXpQk4DVoE Message-ID: Subject: Re: Extract numeric filed in JSONB more effectively To: Andy Fan Cc: Dmitry Dolgov <9erthalion6@gmail.com>, David Rowley , Peter Eisentraut , Amit Langote , Alvaro Herrera , jian he , Chapman Flack , pgsql-hackers@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000000671e7064eda9cdc" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000000671e7064eda9cdc Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Apr 6, 2026 at 11:14=E2=80=AFPM Andy Fan wr= ote: > > Hi Haibo, > > > I=E2=80=99d like to continue pushing this patch forward. > > Based on the earlier discussion, I reworked the patch into a smaller > > stage-1 version with a narrower scope and a simpler rewrite > > strategy. .. > > and does not yet try to cover array/path extraction or integer/float > > typed extractors. > > Thanks for working on this. I did a quick comparison between this > version and my last patch v18 [1], here is the difference. > > My previous v18[1]: 542 insertions(+), 59 deletions(-) > Your patch: 543 insertions(+), 5 deletions(-) > > However as what you have realized, your current patch still lack of many > optimizations, Not only the integer/float stuff, but also many > functions, e.g. jsonb_array_element, jsonb_extract_path, > jsonb_path_query and jsonb_path_query_first. After considering this, > what patch will look like in your approach? I guess you can see many > duplications. Less code doesn't always mean better, but I think this > still be a key consideration to address/check. > Yes, I agree the current patch is still missing a large part of the full matrix from v18 =E2=80=94 not only int4/int8/float8, but also other extract= or families such as jsonb_array_element, jsonb_extract_path, jsonb_path_query, and jsonb_path_query_first. My intent with this version was not to cover the whole space at once, but to first reduce it to a smaller stage-1 subset that is easier to review and easier to reason about. In particular, I wanted to validate the simpler rewrite shape first: keep the support-function-based approach, but rewrite directly to explicit typed extractor functions, without the previous start/finish/internal pipeline. I agree that if this approach is extended to the full matrix naively, duplication will become a real issue. So I am not claiming that the current patch shape should simply be copied mechanically across every extractor/type combination. My thought was to first see whether this narrower subset is acceptable in principle, and then decide how best to extend it without reintroducing too much abstraction or too much duplication. So for now I would prefer to keep the current scope intentionally narrow: - jsonb_object_field / -> / equivalent subscripting form - casts to numeric and bool and treat the rest as follow-up work, rather than trying to solve the entire matrix in one patch. > > > I also ran a small microbenchmark to isolate the cast-over-object-field > path. On my setup, the current patch shows the following gains: > > Query Before After > Speedup > > > -------------------------------------------------------------------------= ------- > > SELECT sum((j->'n')::numeric) FROM t 118.028 ms 56.082 ms > 2.10x > > SELECT count(*) FROM t WHERE (j->'b')::bool 115.665 ms 51.945 ms > 2.23x > > > -------------------------------------------------------------------------= ------- > > Thanks for running the test. > > [1] https://www.postgresql.org/message-id/87ttk0lgcx.fsf%40163.com > > -- > Best Regards > Andy Fan > Thanks, Haibo --0000000000000671e7064eda9cdc Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Mon, Apr 6, 2026 at 11:14=E2=80=AFPM A= ndy Fan <zhihuifan1213@163.com<= /a>> wrote:

Hi Haibo,

> I=E2=80=99d like to continue pushing this patch forward.
> Based on the earlier discussion, I reworked the patch into a smaller > stage-1 version with a narrower scope and a simpler rewrite
> strategy. ..
> and does not yet try to cover array/path extraction or integer/float > typed extractors.

Thanks for working on this. I did a quick comparison between this
version and my last patch v18 [1],=C2=A0 here is the difference.

My previous v18[1]: 542 insertions(+), 59 deletions(-)
Your patch: 543 insertions(+), 5 deletions(-)

However as what you have realized, your current patch still lack of many optimizations, Not only the integer/float stuff, but also many
functions, e.g. jsonb_array_element, jsonb_extract_path,
jsonb_path_query and jsonb_path_query_first. After considering this,
what patch will look like in your approach? I guess you can see many
duplications. Less code doesn't always mean better, but I think this still be a key consideration to address/check.=C2=A0
<= br>

Yes, I agree the current patch is still missing a lar= ge part of the full matrix from v18 =E2=80=94 not only int4/int8/float8, but also other extractor families such as jsonb_array_element, jsonb_extra= ct_path, jsonb_path_query, and jsonb_path_query_first.

My intent with this version was not to cover the whol= e space at once, but to first reduce it to a smaller stage-1 subset that is= easier to review and easier to reason about. In particular, I wanted to va= lidate the simpler rewrite shape first: keep the support-function-based app= roach, but rewrite directly to explicit typed extractor functions, without = the previous start/finish/internal pipeline.

I agree that if this approach is extended to the full= matrix naively, duplication will become a real issue. So I am not claiming= that the current patch shape should simply be copied mechanically across e= very extractor/type combination. My thought was to first see whether this n= arrower subset is acceptable in principle, and then decide how best to exte= nd it without reintroducing too much abstraction or too much duplication.

So for now I would prefer to keep the current scope i= ntentionally narrow:

  • jsonb_object_field / = -> / equivalent subscripting form

  • casts to numeric and = bool

and treat the rest as follow-up work, rather than try= ing to solve the entire matrix in one patch.=C2=A0


> I also ran a small microbenchmark to isolate the cast-over-object-fiel= d path. On my setup, the current patch shows the following gains:
> Query=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 Before=C2=A0 =C2=A0 =C2=A0 After=C2=A0 =C2=A0 =C2=A0 =C2=A0Speedup
> ----------------------------------------------------------------------= ----------
> SELECT sum((j->'n')::numeric) FROM t=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0118.028 ms=C2=A0 56.082 ms=C2=A0 =C2=A02.10x
> SELECT count(*) FROM t WHERE (j->'b')::bool=C2=A0 115.665 m= s=C2=A0 51.945 ms=C2=A0 =C2=A02.23x
> ----------------------------------------------------------------------= ----------

Thanks for running the test.

[1]
https://www.postgresql.org/message= -id/87ttk0lgcx.fsf%40163.com

--
Best Regards
Andy Fan

Thanks,

Haibo=C2=A0<= /div>
--0000000000000671e7064eda9cdc--