From: Michel Pelletier
Date: Sun, 20 Oct 2024 11:25:30 -0700
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
To: Tom Lane
Cc: pgsql-general
In-Reply-To: <1342498.1729444411@sss.pgh.pa.us>
References: <1342498.1729444411@sss.pgh.pa.us>

On Sun, Oct 20, 2024 at 10:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Michel Pelletier <pelletier.michel@gmail.com> writes:
> > I found this thread from the original path implementation from Tom Lane in
> >
> > >> appropriate APIs is left as a task for future work.
>
> Yeah, we thought that it wouldn't be appropriate to try to design
> general APIs till we had more kinds of expandable objects.  Maybe
> now is a good time to push forward on that.

Great, I'm happy to be involved!

> > My first thought was to add a flag to CREATE TYPE like "EXPANDED = true" or
> > some other better name that indicates that the object can be safely taken
> > ownership of in its expanded state and not copied.
>
> Isn't that inherent in the notion of R/W vs R/O expanded-object
> pointers?

I'm not sure I'm qualified enough to agree with you but I see your point.

> > And then there is just removing the existing restriction on arrays only.
> > Is any other expanded object out there really interested in being
> > flattened/expanded over and over again?
>
> I'm not sure.  It seems certain that if the object is already expanded
> (either R/W or R/O), the paths for that in plpgsql_exec_function could
> be taken regardless of its specific type.

Agreed.

> The thing that is debatable
> is "if the object is in a flat state, should we forcibly expand it
> here?".  That could be a win if the function later does object
> accesses that would benefit --- but it might never do so.  We chose
> to always expand arrays, and we've gotten little pushback on that,
> but the tradeoffs might be different for other sorts of expanded
> objects.

The OneSparse objects always expand themselves on first use via a DatumGetExpandedArray-like macro wrapper. Their flat representation has no run-time benefit, so I'm fine with not forcing their expansion ahead of time. But once I expand one, I'd like it to stay expanded until the function returns (as much as possible), because the serialization cost for very large sparse matrices is heavy.
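
To make the cost concrete, here is a minimal sketch of the expand-on-first-use pattern in plain C. It is a standalone toy, not PostgreSQL or OneSparse code: `ToyDatum`, `ToyMatrix`, `toy_get_expanded`, and the other `toy_` names are hypothetical, and the `kind` tag stands in for whatever check the real wrapper uses to tell a flat value from an expanded one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins only; none of these names exist in PostgreSQL. */
typedef struct ToyMatrix {
    size_t nvals;
    double *vals;
} ToyMatrix;

typedef enum { TOY_FLAT, TOY_EXPANDED } ToyKind;

typedef struct ToyDatum {
    ToyKind kind;
    const double *flat;    /* serialized values, used when kind == TOY_FLAT */
    size_t nvals;
    ToyMatrix *expanded;   /* live object, used when kind == TOY_EXPANDED */
} ToyDatum;

static int toy_expand_calls = 0;   /* counts how often expansion is paid for */

/* Build the in-memory form from the flat one (the expensive step). */
static ToyMatrix *toy_expand(const double *flat, size_t nvals)
{
    ToyMatrix *m = malloc(sizeof(ToyMatrix));
    assert(m != NULL);
    m->nvals = nvals;
    m->vals = malloc((nvals ? nvals : 1) * sizeof(double));
    assert(m->vals != NULL);
    memcpy(m->vals, flat, nvals * sizeof(double));
    toy_expand_calls++;
    return m;
}

static void toy_free(ToyMatrix *m)
{
    free(m->vals);
    free(m);
}

/* Accessor wrapper: reuse an expanded object, expand a flat one on demand. */
static ToyMatrix *toy_get_expanded(ToyDatum d)
{
    if (d.kind == TOY_EXPANDED)
        return d.expanded;
    return toy_expand(d.flat, d.nvals);
}

/* Wrap a live object so later callers see the expanded form. */
static ToyDatum toy_expanded_datum(ToyMatrix *m)
{
    ToyDatum d = { TOY_EXPANDED, NULL, m->nvals, m };
    return d;
}

/* A "function" that touches the object through the wrapper. */
static double toy_sum(ToyDatum d)
{
    ToyMatrix *m = toy_get_expanded(d);
    double total = 0.0;
    for (size_t i = 0; i < m->nvals; i++)
        total += m->vals[i];
    if (d.kind == TOY_FLAT)
        toy_free(m);   /* the expansion was temporary, so its cost is wasted */
    return total;
}
```

In this toy, every `toy_sum` call on a flat `ToyDatum` pays for a fresh expansion, while passing the result of `toy_expanded_datum` pays once and reuses the object. That difference is the behavior I'm after for variables inside a plpgsql function.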

> The other problem is that plpgsql only knows how to do such expansion
> for arrays, and it's not obvious how to extend that part.

Perhaps a third member function for ExpandedObjectMethods that formalizes the expansion interface, like the one found in DatumGetExpandedArray?  I closely follow that same pattern in my code.
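
One way to picture that idea is a method table that carries an expand callback next to the two existing ones. The sketch below is a standalone toy in plain C, not PostgreSQL source. Its first two callbacks are modeled on the get_flat_size and flatten_into members of ExpandedObjectMethods; `ToyObject`, `ToyObjectMethods`, `expand`, `toy_expand_generic`, and the `vec_` functions are hypothetical names. The `expand` signature is an assumption, and so is the notion that a caller can locate the method table for a flat value from its type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyObject ToyObject;

/* Two callbacks modeled on the existing ones, plus a HYPOTHETICAL third. */
typedef struct ToyObjectMethods {
    size_t (*get_flat_size)(const ToyObject *obj);
    void (*flatten_into)(const ToyObject *obj, void *result,
                         size_t allocated_size);
    ToyObject *(*expand)(const void *flat, size_t flat_size); /* hypothetical */
} ToyObjectMethods;

struct ToyObject {
    const ToyObjectMethods *methods;
    size_t nvals;
    double *vals;
};

static size_t vec_get_flat_size(const ToyObject *obj)
{
    return obj->nvals * sizeof(double);
}

static void vec_flatten_into(const ToyObject *obj, void *result,
                             size_t allocated_size)
{
    assert(allocated_size == vec_get_flat_size(obj));
    memcpy(result, obj->vals, allocated_size);
}

static ToyObject *vec_expand(const void *flat, size_t flat_size);

static const ToyObjectMethods vec_methods = {
    vec_get_flat_size, vec_flatten_into, vec_expand
};

/* Type-specific expansion: copy the flat bytes into a live object. */
static ToyObject *vec_expand(const void *flat, size_t flat_size)
{
    ToyObject *obj = malloc(sizeof(ToyObject));
    assert(obj != NULL);
    obj->methods = &vec_methods;
    obj->nvals = flat_size / sizeof(double);
    obj->vals = malloc(flat_size > 0 ? flat_size : 1);
    assert(obj->vals != NULL);
    memcpy(obj->vals, flat, flat_size);
    return obj;
}

/* Type-agnostic caller: it needs the method table, not the type's layout. */
static ToyObject *toy_expand_generic(const ToyObjectMethods *methods,
                                     const void *flat, size_t flat_size)
{
    return methods->expand(flat, flat_size);
}

static void toy_free(ToyObject *obj)
{
    free(obj->vals);
    free(obj);
}
```

With something of this shape, a generic caller could expand a flat value through `toy_expand_generic` without array-specific knowledge, and flatten it again through the same table. How such a table would be found for a value that is still flat is left open here.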

> But it seems like we could get an easy win by adjusting
> plpgsql_exec_function along the lines of
> ...
>
> How far does that improve matters for you?

I'll give it a try in my local benchmarking code and get back to you. Thank you for the fast reply and the insight!

-Michel