public inbox for [email protected]
help / color / mirror / Atom feedFrom: Andy Fan <[email protected]>
To: Chapman Flack <[email protected]>
Cc: jian he <[email protected]>
Cc: Pavel Stehule <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Extract numeric filed in JSONB more effectively
Date: Wed, 30 Aug 2023 12:47:35 +0800
Message-ID: <CAKU4AWrxBXomhGcubsLqBj4FRbOf1AEXqpwBsm-mGJEJ0G_ypA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAKU4AWoqAVya6PBhn+BCbFaBMt3z-2=i5fKO3bW=6HPhbid2Dw@mail.gmail.com>
<CACJufxHASOEpngQ8V2tbXgs4VZC3ETrVbS=uk0KC_B_J1j7ejQ@mail.gmail.com>
<CAKU4AWrap1zpYqunJwWTN=CdP7E8e0U4mYmwn7hvTW3ERuENVg@mail.gmail.com>
<CAFj8pRD-R-GsGCjeYApbhZoiW8TV6zACaYStMBMM0=--+WgN_A@mail.gmail.com>
<CAKU4AWpDdFXAD+dMC1HeErXSKBUUBRGWkf=dAcX3wZgBNsWM=g@mail.gmail.com>
<CAFj8pRAO3oEiBaJJ9=HZp6CoP2ffbwSgrKkLKjPfYZwx9wOOuQ@mail.gmail.com>
<CAKU4AWoCHpKAVuQeOrk44cVPy_dVxn1aHrMUvHy5Ag-daFCSsQ@mail.gmail.com>
<CAFj8pRD4cdUmK0RG4oN5B2KRSeDhwfMYaL=XpfEu4iaLeZ_Kow@mail.gmail.com>
<CAKU4AWp8ab61e96v57OaB-Gm1bMfBNVLVy+s17U6_Ne3veB84g@mail.gmail.com>
<[email protected]>
<CAKU4AWp+KLes8g=BWLqZfDmW9+=ZY0UC4G0i3qVcYEviK_dDTA@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAKU4AWrBY9GHj9oZbvhiOG1BgiWyZC8FGPAET-CfRKDhYyv1HQ@mail.gmail.com>
<CAKU4AWrGM5bK7wi4Y8bTYhKgh=A1fW=X00eC_jfk6_JXyaEURQ@mail.gmail.com>
<CAFj8pRC+4pvSuibB2xcNKJ=6PSF=TAcOtRNpdLPBXZjijFg7ag@mail.gmail.com>
<CAKU4AWrxHFVZM-gGPpOrVPreZMePAOoY580Tq-+CvxDWHmP_uA@mail.gmail.com>
<CAKU4AWp3410VpYFdCxpFdaHc8he6zj_=Fvww53TnU+g-bvvvsQ@mail.gmail.com>
<CACJufxH7ftu9HD+h_gDWPDvq1ZO8vGm81JomSKjvQacCLMLcxg@mail.gmail.com>
<CAKU4AWqbd_oDwXyK7=yMKbhAR=CQtVOWszcgft+cTG6JCTKmzQ@mail.gmail.com>
<[email protected]>
<CAKU4AWr1bsGaWWzQHJwB=WXboJrCM242=x6XbHO06vuhWsY4Ww@mail.gmail.com>
<[email protected]>
<CAKU4AWoLgC5ejOF8jxskd5oq52D-eR_1Q-HM5+e8OBVLak=qTg@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAKU4AWpnFXYXtfOQp_QdQ7RwCD-d+qzyZKdNygc+9DY8bveVYQ@mail.gmail.com>
<[email protected]>
<CAKU4AWrysndLra+SZLsODZF-bet1JnPMLU9HsiFH75ZnSPK2zw@mail.gmail.com>
<CAKU4AWrs4Pzajm2_tgtUTf=CWfDJEx=3h45Lhqg7tNOVZw5YxA@mail.gmail.com>
<CAKU4AWryj3pFG87mfQGS-K6XGOiyeYsMLOgajZ337NRV1F6Wfw@mail.gmail.com>
<[email protected]>
<[email protected]>
(Sorry for leaving this discussion for such a long time, how times fly!)
On Sun, Aug 27, 2023 at 6:28 AM Chapman Flack <[email protected]> wrote:
> On 2023-08-22 08:16, Chapman Flack wrote:
> > On 2023-08-22 01:54, Andy Fan wrote:
> >> After we label it, we will get error like this:
> >>
> >> select (a->'a')::int4 from m;
> >> ERROR: cannot display a value of type internal
> >
> > Without looking in depth right now, I would double-check
> > what relabel node is being applied at the result. The idea,
> > of course, was to relabel the result as the expected result
>
> as I suspected, looking at v10-0003, there's this:
>
> + fexpr = (FuncExpr *)makeRelabelType((Expr *) fexpr, INTERNALOID,
> + 0, InvalidOid,
> COERCE_IMPLICIT_CAST);
>
> compared to the example I had sent earlier:
>
> On 2023-08-18 17:02, Chapman Flack wrote:
> > expr = (Expr *)makeRelabelType((Expr *)fexpr,
> > targetOid, -1, InvalidOid, COERCE_IMPLICIT_CAST);
>
> The key difference: this is the label going on the result type
> of the function we are swapping in.
I'm feeling we have some understanding gap in this area, let's
see what it is. Suppose the original query is:
numeric(jsonb_object_field(v_jsonb, text)) -> numeric.
without the patch 003, the rewritten query is:
jsonb_object_field_type(NUMERICOID, v_jsonb, text) -> NUMERIC.
However the declared type of jsonb_object_field_type is:
jsonb_object_field_type(internal, jsonb, text) -> internal.
So the situation is: a). We input a CONST(type=OIDOID, ..) for an
internal argument. b). We return a NUMERIC type which matches
the original query c). result type NUMERIC doesn't match the declared
type 'internal' d). it doesn't match the run-time type of internal
argument which is OID.
case a) is fixed by RelableType. case b) shouldn't be treat as an
issue. I thought you wanted to address the case c), and patch
003 tries to fix it, then the ERROR above. At last I realized case
c) isn't the one you want to fix. case d) shouldn't be requirement
at the first place IIUC.
So your new method is:
1. jsonb_{op}_start() -> internal (internal actually JsonbValue).
2. jsonb_finish_{type}(internal, ..) -> type. (internal is JsonbValue ).
This avoids the case a) at the very beginning. I'd like to provides
patches for both solutions for comparison. Any other benefits of
this method I am missing?
> Two more minor differences: (1) the node you get from
> makeRelabelType is an Expr, but not really a FuncExpr. Casting
> it to FuncExpr is a bit bogus. Also, the third argument to
> makeRelabelType is a typmod, and I believe the "not-modified"
> typmod is -1, not 0.
>
My fault, you are right.
>
> On 2023-08-21 06:50, Andy Fan wrote:
> > I'm not very excited with this manner, reasons are: a). It will have
> > to emit more steps in ExprState->steps which will be harmful for
> > execution. The overhead is something I'm not willing to afford.
>
> I would be open to a performance comparison, but offhand I am not
> sure whether the overhead of another step or two in an ExprState
> is appreciably more than some of the overhead in the present patch,
> such as the every-time-through fcinfo initialization buried in
> DirectFunctionCall1 where you don't necessarily see it. I bet
the fcinfo in an ExprState step gets set up once, and just has
> new argument values slammed into it each time through.
>
fcinfo initialization in DirectFunctionCall1 is an interesting point!
so I am persuaded the extra steps in ExprState may not be
worse than the current way due to the "every-time-through
fcinfo initialization" (in which case the memory is allocated
once in heap rather than every time in stack). I can do a
comparison at last to see if we can find some other interesting
findings.
> I would not underestimate the benefit of reducing the code
> duplication and keeping the patch as clear as possible.
> The key contributions of the patch are getting a numeric or
> boolean efficiently out of the JSON operation. Getting from
> numeric to int or float are things the system already does
> well.
True, reusing the casting system should be better than hard-code
the casting function manually. I'd apply this on both methods.
> A patch that focuses on what it contributes, and avoids
> redoing things the system already can do--unless the duplication
> can be shown to have a strong performance benefit--is easier to
> review and probably to get integrated.
>
Agreed.
At last, thanks for the great insights and patience!
--
Best Regards
Andy Fan
view thread (35+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Extract numeric filed in JSONB more effectively
In-Reply-To: <CAKU4AWrxBXomhGcubsLqBj4FRbOf1AEXqpwBsm-mGJEJ0G_ypA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox