On Fri, Apr 10, 2026 at 2:48 PM Haibo Yan <tristan.yim@gmail.com> wrote:On Wed, Apr 8, 2026 at 12:50 PM Haibo Yan <tristan.yim@gmail.com> wrote:On Tue, Apr 7, 2026 at 6:21 PM Haibo Yan <tristan.yim@gmail.com> wrote:On Tue, Apr 7, 2026 at 5:00 PM Andy Fan <zhihuifan1213@163.com> wrote:Haibo Yan <tristan.yim@gmail.com> writes:Hi Andy,
Hi Haibo,
> I agree that if this approach is extended to the full matrix naively,
> duplication will become a real issue.
Could you summary how it would be? I think it would be helpful for
others to review. Otherwise every reviewer needs to count them many
times.
--
Best Regards
Andy Fan
Sure.
My current thought is to extend it in stages, rather than trying to solve the full matrix in a single patch.
A rough plan would be:
1. Keep the current stage-1 patch small and validate the basic approach first
2. Extend target types before extending extractor families
jsonb_object_field / -> / equivalent subscripting form
casts to numeric and bool
support-function rewrite directly to explicit typed extractor functions
3. Then extend to other extractor families with the same overall pattern
add int4 / int8 / float8 for the same object-field family first
keep the SQL-visible rewrite targets explicit, e.g.
jsonb_object_field_int4
jsonb_object_field_int8
jsonb_object_field_float8
avoid the previous numeric-intermediate rewrite shape
4. Keep duplication manageable by sharing the implementation underneath
likely starting with jsonb_array_element and jsonb_extract_path
and possibly jsonb_path_query_first later
each family would still rewrite to explicit typed extractor entry points, e.g.
jsonb_array_element_numeric
jsonb_extract_path_bool
jsonb_path_query_first_int4
keep the SQL/catalog-level rewrite targets explicit for readability and reviewability
but factor the C implementation into:
extractor-family lookup helpers
target-type conversion helpers
thin wrappers, possibly generated with small macros
So the idea would be: explicit rewrite targets at the SQL/catalog level, but shared lookup/conversion code underneath, instead of going back to the earlier start/finish/internal pipeline.
I agree that if this is extended naively across the full matrix, duplication will become a real issue. My reason for keeping the current patch narrow is that I wanted to first validate this simpler rewrite shape on a small subset before deciding how best to scale it further.
Regards,
Haibo
Hi all,
Following up on our previous discussion, I want to clarify the current patch plan together with the updated first patch.
Earlier I described this work roughly as a 4-patch line. After iterating on the implementation and trying to keep each step reviewable, I now think the cleaner split is a 5-patch series:
object-field casts to scalar types
array-element casts to scalar types
extract-path casts to scalar types
multi-subscript casts via extract-path lowering
jsonpath-first casts to scalar types (jsonb_path_query_first and _tz)
The overall design is unchanged: use the cast function’s support hook to recognize cast(extract(...)) over scalar-returning jsonb extraction functions, and rewrite that directly to explicit typed extractor calls.
Supported target types remain:
- numeric
- bool
- int4
- int8
- float8
One point I also want to make explicit is that I do not plan to include jsonb_path_query in this series.
After looking at it more carefully, I do not think it fits the same model as the rest of the series. The patches here are all about scalar-returning extraction functions, where the cast prosupport hook can see and rewrite a scalar expression pair. jsonb_path_query is set-returning, so optimizing casts over it would likely need a different mechanism, probably at planner or executor level, rather than one more patch in this prosupport-based series.
Attached here is the updated first patch in the current plan.
This patch covers object-field extraction only:
- jsonb_object_field
- -> with text key
- key subscripting
and rewrites casts to:
- numeric
- bool
- int4
- int8
- float8
to direct typed extractor calls.
Thanks again for the earlier comments. I plan to send the remaining patches in follow-up emails in the order above.
Regards,
HaiboHi all,
Following up on the earlier thread, I am continuing with the same overall plan and sending the next two patches in the series together.
The design is still the same as before: use the cast function’s support hook to recognize cast(extract(...)) over scalar-returning jsonb extraction functions, and rewrite that directly to explicit typed extractor calls, without changing normal SQL syntax.
At this point, I have also folded int2 and float4 into the per-family patches, so each patch now carries the full target-type coverage for the functionality it introduces.
The supported target types are now:
numericI am still keeping jsonb_path_query out of scope for this series. The series is focused on scalar-returning extraction functions, where the cast prosupport hook can see and rewrite a scalar expression pair. jsonb_path_query is set-returning, so optimizing casts over it looks like a different planner/executor problem rather than one more patch in this prosupport-based line.
- bool
- int2
- int4
- int8
- float4
- float8
With that in mind, the current 5-patch plan is:In this email I am sending the first two patches:
- object-field casts to scalar types
- array-element casts to scalar types
- extract-path casts to scalar types
- multi-subscript casts via extract-path lowering
- jsonpath-first casts to scalar types (jsonb_path_query_first and _tz)
Patch 1 covers:
- patch 1: object-field casts to scalar types
- patch 2: array-element casts to scalar types
Patch 2 covers:
- jsonb_object_field
- -> with text key
- key subscripting
Both patches now support the full target-type set listed above.
- jsonb_array_element
- -> with integer RHS
- single-index array subscripting
Thanks again for the earlier comments. I plan to continue with the remaining patches in follow-up emails in the same order.
Regards,
HaiboHi all,
I spent some time reworking this patch series into a shape that I think is easier to review and easier to reason about patch-by-patch.
The goal of the series is still the same: optimize casts over scalar-returning
jsonbextraction functions by using the cast function’s support hook to recognize:
cast(extract(...))and rewrite that directly to explicit typed extractor calls.
That keeps ordinary SQL syntax unchanged, but avoids the extra jsonb scalar wrapping/unwrapping on the optimized path.
At this point the series is organized as 5 patches:
- object-field casts to scalar types
- array-element casts to scalar types
- extract-path casts to scalar types
- multi-subscript casts via extract-path lowering
- jsonpath-first casts to scalar types
The supported target types are:
numericboolint2int4int8float4float8The covered scalar-returning extraction families/forms are:
jsonb_object_field->with text key- equivalent key subscripting
jsonb_array_element->with integer RHS- single-index array subscripting
jsonb_extract_path#>- direct
jsonb_extract_path(...)- multi-subscript jsonb subscripting (lowered to extract-path)
jsonb_path_query_firstjsonb_path_query_first_tzA few points about the current shape of the series:
- I intentionally stayed with the support-function rewrite model and did not introduce new user-visible operators.
- I also did not go back to the earlier start/finish/internal pipeline approach. The current version rewrites directly to explicit typed extractor functions.
- I folded
int2andfloat4into the same family-based design, since they are the remaining natural numeric-family targets that fit the same conversion model.- I kept the multi-subscript patch conservative. It lowers through the existing extract-path family, but it does not try to turn this into a broader executor-side subscripting redesign.
One explicit boundary of the series is that it does not try to optimize
jsonb_path_query.After looking at that more carefully, I do not think it belongs in the same patch line. This series is about scalar-returning extraction functions, where the cast prosupport hook can see and rewrite a scalar expression pair.
jsonb_path_queryis set-returning, so optimizing casts over it would likely require a different mechanism, probably at planner or executor level, rather than one more patch in this prosupport-based series.Another explicit boundary is that the series stops at the current numeric/bool scalar targets. I did not try to extend it to text/date/uuid-like conversions, because those would require different semantics such as full jsonb serialization or string parsing, rather than the direct scalar conversion pattern used here.
I expect one likely question will be the catalog footprint, since this approach adds a noticeable number of typed extractor builtins. I think that is the cleanest tradeoff in this case: SQL is statically typed, and explicit typed extractors keep the rewrite path simple, predictable, and reviewable. The implementation-side boilerplate is kept under control with shared helpers and thin family-specific macros.
Thanks in advance for any review and feedback.
Regards,
Haibo
Hi all,
I saw that CI failed on the latest version of the series, in the jsonb_path_query_first / jsonb_path_query_first_tz typed-extractor path.
The failure does not look like an expected-output issue. The more likely problem is in how the typed jsonpath-first helper is obtaining or returning the first JsonbValue from the jsonpath execution result.
At the moment my working suspicion is one of these:
JsonbValue *, for example a pointer into a temporary JsonValueList or other local container whose contents are no longer stable by the time the conversion helper inspects it; orJsonbValue to the conversion helper, and is instead interpreting some other internal structure as a JsonbValue, which would explain the bogus unknown jsonb type failure seen in CI.Regards,
Haibo