public inbox for [email protected]  
help / color / mirror / Atom feed
Re: search_path for PL/pgSQL functions partially cached?
33+ messages / 7 participants
[nested] [flat]

* Re: search_path for PL/pgSQL functions partially cached?
@ 2024-12-27 20:26 David G. Johnston <[email protected]>
  2024-12-27 20:57 ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2024-12-27 21:03 ` Re: search_path for PL/pgSQL functions partially cached? Tom Lane <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  0 siblings, 3 replies; 33+ messages in thread

From: David G. Johnston @ 2024-12-27 20:26 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: [email protected] <[email protected]>

On Friday, December 27, 2024, Jan Behrens <[email protected]> wrote:
>
>
> It seems that it matters *both* how the search_path was set during the
> *first* invocation of the function within a session *and* how it is set
> during the actual call of the function. So even if there are just two
> schemas involved, there are 4 possible outcomes for the "run" function's
> result ('2.4', '2', '5', and '5.4'). To me, this behavior seems to be
> somewhat dangerous. Maybe it is even considered a bug?


It is what it is - and if one is not careful one can end up writing
hard-to-understand and possibly buggy code due to the various execution
environments and caches involved.

I’ve never really understood why “%TYPE’ exists…


> Or is it documented somewhere?



https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

Can someone explain to me what's going on, and what is the best practice to
> deal with it? Is there a way to avoid fully qualifying every type and
> expression? Which parts do I have to qualify or is this something that
> could be fixed in a future version of PostgreSQL?
>

Add qualification or attach a “set search_path” clause to “create
function”.  Code stored in the server should not rely on the session
search_path.

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
@ 2024-12-27 20:57 ` Pavel Stehule <[email protected]>
  2 siblings, 0 replies; 33+ messages in thread

From: Pavel Stehule @ 2024-12-27 20:57 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: Jan Behrens <[email protected]>; [email protected] <[email protected]>

Hi

pá 27. 12. 2024 v 21:26 odesílatel David G. Johnston <
[email protected]> napsal:

> On Friday, December 27, 2024, Jan Behrens <[email protected]> wrote:
>>
>>
>> It seems that it matters *both* how the search_path was set during the
>> *first* invocation of the function within a session *and* how it is set
>> during the actual call of the function. So even if there are just two
>> schemas involved, there are 4 possible outcomes for the "run" function's
>> result ('2.4', '2', '5', and '5.4'). To me, this behavior seems to be
>> somewhat dangerous. Maybe it is even considered a bug?
>
>
> It is what it is - and if one is not careful one can end up writing
> hard-to-understand and possibly buggy code due to the various execution
> environments and caches involved.
>

I think plan cache should be invalidated when search_path is different, but
maybe there is some bug - there are some optimizations related to faster
execution of simple expressions.


> I’ve never really understood why “%TYPE’ exists…
>

referenced types should increase readability - it ensures type
compatibility - minimally on oracle, where the change of schema requires
recompilation. In Postgres it is working on 99% - plpgsql functions don't
hold dependency on types.


>
>> Or is it documented somewhere?
>
>
>
> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
>
> Can someone explain to me what's going on, and what is the best practice
>> to deal with it? Is there a way to avoid fully qualifying every type and
>> expression? Which parts do I have to qualify or is this something that
>> could be fixed in a future version of PostgreSQL?
>>
>
> Add qualification or attach a “set search_path” clause to “create
> function”.  Code stored in the server should not rely on the session
> search_path.
>

a lot of functionality in Postgres depends on the search path - and then
all should be consistent. Sure, writing procedures that depend on the
current search path can be a short way to hell.

I cannot to reproduce it

CREATE OR REPLACE FUNCTION s1.fx1()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
begin
  return 100;
end
$function$

CREATE OR REPLACE FUNCTION s2.fx1()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
begin
  return 200;
end
$function$

CREATE OR REPLACE FUNCTION public.foo()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare v int;
begin v := fx1();
  raise notice '%', v;
end;
$function$

(2024-12-27 21:53:13) postgres=# set search_path to s1;
SET
(2024-12-27 21:53:34) postgres=# select public.foo();
NOTICE:  100
┌─────┐
│ foo │
╞═════╡
│     │
└─────┘
(1 row)

(2024-12-27 21:53:44) postgres=# set search_path to s2;
SET
(2024-12-27 21:53:47) postgres=# select public.foo();
NOTICE:  200
┌─────┐
│ foo │
╞═════╡
│     │
└─────┘
(1 row)

(2024-12-27 21:53:48) postgres=# set search_path to s1;
SET
(2024-12-27 21:53:51) postgres=# select public.foo();
NOTICE:  100
┌─────┐
│ foo │
╞═════╡
│     │
└─────┘
(1 row)

so from my perspective is pg ok, tested on pg16 and pg18






> David J.
>
>


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
@ 2024-12-27 21:03 ` Tom Lane <[email protected]>
  2024-12-27 21:23   ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2 siblings, 1 reply; 33+ messages in thread

From: Tom Lane @ 2024-12-27 21:03 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: Jan Behrens <[email protected]>; [email protected] <[email protected]>

"David G. Johnston" <[email protected]> writes:
> It is what it is - and if one is not careful one can end up writing
> hard-to-understand and possibly buggy code due to the various execution
> environments and caches involved.

Yeah, I don't see this changing.  The actual answer is that we have
search_path-aware caching of expressions and query plans within a
plpgsql function, which is why the call to foo() reacts to the current
search path.  But the types of plpgsql variables are only looked up
on the first use (within a session).  Perhaps we ought to work harder
on that, but it seems like a lot of overhead to add for something that
will benefit next to nobody.

> I’ve never really understood why “%TYPE’ exists…

Compatibility with Oracle, I imagine.  I agree it's a bizarre feature.
But you could get the same behavior without %TYPE, just by referencing
some other type that has different declarations in different schemas.

> Add qualification or attach a “set search_path” clause to “create
> function”.  Code stored in the server should not rely on the session
> search_path.

Yeah, adding "set search_path" is recommendable if you don't want to
think hard about this stuff.

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 21:03 ` Re: search_path for PL/pgSQL functions partially cached? Tom Lane <[email protected]>
@ 2024-12-27 21:23   ` Pavel Stehule <[email protected]>
  0 siblings, 0 replies; 33+ messages in thread

From: Pavel Stehule @ 2024-12-27 21:23 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: David G. Johnston <[email protected]>; Jan Behrens <[email protected]>; [email protected] <[email protected]>

pá 27. 12. 2024 v 22:03 odesílatel Tom Lane <[email protected]> napsal:

> "David G. Johnston" <[email protected]> writes:
> > It is what it is - and if one is not careful one can end up writing
> > hard-to-understand and possibly buggy code due to the various execution
> > environments and caches involved.
>
> Yeah, I don't see this changing.  The actual answer is that we have
> search_path-aware caching of expressions and query plans within a
> plpgsql function, which is why the call to foo() reacts to the current
> search path.  But the types of plpgsql variables are only looked up
> on the first use (within a session).  Perhaps we ought to work harder
> on that, but it seems like a lot of overhead to add for something that
> will benefit next to nobody.
>
> > I’ve never really understood why “%TYPE’ exists…
>
> Compatibility with Oracle, I imagine.  I agree it's a bizarre feature.
> But you could get the same behavior without %TYPE, just by referencing
> some other type that has different declarations in different schemas.
>

This feature is not bizarre - just the implementation in Postgres is not
fully complete (and I am not sure if it is fixable). PLpgSQL uses plan
cache, but there is nothing similar for types.
It is designed for Oracle where search_path doesn't exist, and where change
of schema invalidates code, and requires recompilation. PL/pgSQL and
Postgres are much more dynamic systems than Oracle. Maybe PL/pgSQL
functions can holds dependency on types, and when any related custom type
is changed, then the cached function can be invalidated. Unfortunately, the
frequent change of search path can kill the performance.


> > Add qualification or attach a “set search_path” clause to “create
> > function”.  Code stored in the server should not rely on the session
> > search_path.
>
> Yeah, adding "set search_path" is recommendable if you don't want to
> think hard about this stuff.
>
>                         regards, tom lane
>
>
>


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
@ 2024-12-27 23:40 ` Jan Behrens <[email protected]>
  2024-12-28 05:34   ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2 siblings, 3 replies; 33+ messages in thread

From: Jan Behrens @ 2024-12-27 23:40 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: [email protected] <[email protected]>

On Fri, 27 Dec 2024 13:26:28 -0700
"David G. Johnston" <[email protected]> wrote:

> > Or is it documented somewhere?
> 
> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

I can't find any notes regarding functions and schemas in that section.

> Can someone explain to me what's going on, and what is the best practice to
> > deal with it? Is there a way to avoid fully qualifying every type and
> > expression? Which parts do I have to qualify or is this something that
> > could be fixed in a future version of PostgreSQL?
> >
> 
> Add qualification or attach a “set search_path” clause to “create
> function”.  Code stored in the server should not rely on the session
> search_path.
> 
> David J.

In my (real world) case, I was unable to use "SET search_path FROM
CURRENT" because it isn't possible to use "SET" in procedures that use
transactions, due to this documented limitation:

"If a SET clause is attached to a procedure, then that procedure cannot
execute transaction control statements (for example, COMMIT and
ROLLBACK, depending on the language)."

https://www.postgresql.org/docs/17/sql-createprocedure.html

My procedure looks more or less like this:

CREATE PROCEDURE "myfunc"()
  LANGUAGE plpgsql AS
  $$
  DECLARE
    "old_search_path" TEXT;
    -- some more variables
  BEGIN
    SELECT current_setting('search_path') INTO "old_search_path";
    SET search_path TO 'myschema';
    -- some code that uses COMMIT and SET TRANSACTION ISOLATION LEVEL
    PERFORM set_config('search_path', "old_search_path", FALSE);
  END;
  $$;

My question is: Am I safe if I use fully-qualified types in the DECLARE
section only? Or do I need to provide full qualification also in the
code below (after SET search_path TO 'myschema')?

And bonus question: Is it documented somewhere?

Maybe not many people run into these issues because schemas and
functions aren't used as often in combination?

Kind Regards
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2024-12-28 05:34   ` Pavel Stehule <[email protected]>
  2 siblings, 0 replies; 33+ messages in thread

From: Pavel Stehule @ 2024-12-28 05:34 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

Hi


> Maybe not many people run into these issues because schemas and
> functions aren't used as often in combination?
>

I think schema and functions are common combinations. But when people have
objects with the same name, then they are careful to be sure, so objects
have really identical structure.
Using different types in these objects is very rare. And because Postgres
doesn't support it well, experienced developers don't use it. Similar
issues can do some issues after an stored procedures update, because can
require session reset. Or when you need it, you can use a much more dynamic
type like record.


> Kind Regards
> Jan Behrens
>
>
>


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-01 17:55   ` Jan Behrens <[email protected]>
  2025-01-01 18:12     ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2 siblings, 2 replies; 33+ messages in thread

From: Jan Behrens @ 2025-01-01 17:55 UTC (permalink / raw)
  To: [email protected] <[email protected]>; +Cc: David G. Johnston <[email protected]>

On Sat, 28 Dec 2024 00:40:09 +0100
Jan Behrens <[email protected]> wrote:

> On Fri, 27 Dec 2024 13:26:28 -0700
> "David G. Johnston" <[email protected]> wrote:
> 
> > > Or is it documented somewhere?
> > 
> > https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
> 
> I can't find any notes regarding functions and schemas in that section.

Actually, I found another note in the documentation. But it doesn't
explain things correctly. In the documentation for PostgreSQL 17,
section 36.17.6.1. (Security Considerations for Extension Functions)
says:

"SQL-language and PL-language functions provided by extensions are at
risk of search-path-based attacks when they are executed, since parsing
of these functions occurs at execution time not creation time."

https://www.postgresql.org/docs/17/extend-extensions.html#EXTEND-EXTENSIONS-SECURITY

So here, the manual explicity states that functions are parsed at
execution, not creation time. As seen in my original example in this
thread, this isn't (fully) true. Moreover, it isn't true for all
SQL-language functions, as can be demonstrated with the following code:

============

CREATE SCHEMA s1;
CREATE SCHEMA s2;

CREATE VIEW s1.v AS SELECT 'creation' AS col;
CREATE VIEW s2.v AS SELECT 'runtime' AS col;

SET search_path TO 'public', 's1';

CREATE FUNCTION use_sql_atomic() RETURNS TEXT LANGUAGE sql BEGIN ATOMIC
  SELECT 'use_sql_atomic = ' || col FROM v;
END;

CREATE FUNCTION use_sql_string() RETURNS TEXT LANGUAGE sql AS $$
  SELECT 'use_sql_string = ' || col FROM v;
$$;

CREATE FUNCTION use_plpgsql() RETURNS TEXT LANGUAGE plpgsql AS $$ BEGIN
  RETURN (SELECT 'use_plpgsql = ' || col FROM v);
END; $$;

SET search_path TO 'public', 's2';

SELECT use_sql_atomic() AS "output" UNION ALL
SELECT use_sql_string() AS "output" UNION ALL
SELECT use_plpgsql() AS "output";

============

This generates the following output:

          output           
---------------------------
 use_sql_atomic = creation
 use_sql_string = runtime
 use_plpgsql = runtime
(3 rows)

Overall, PostgreSQL doesn't behave consistent, and to me it seems that
the documentation isn't describing its behavior correctly either.

I understand if fixing this is too much work (even though I would
really like to see this fixed). But given that the current behavior is
highly surprising and inconsistent - and keeping in mind that this is a
subject that may affect security - I think the documentation should
reflect the current behavior at least. I thus see this as a
documentation issue.

Kind regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-01 18:12     ` Adrian Klaver <[email protected]>
  1 sibling, 0 replies; 33+ messages in thread

From: Adrian Klaver @ 2025-01-01 18:12 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; [email protected] <[email protected]>; +Cc: David G. Johnston <[email protected]>

On 1/1/25 09:55, Jan Behrens wrote:
> On Sat, 28 Dec 2024 00:40:09 +0100
> Jan Behrens <[email protected]> wrote:
> 
>> On Fri, 27 Dec 2024 13:26:28 -0700
>> "David G. Johnston" <[email protected]> wrote:
>>
>>>> Or is it documented somewhere?
>>>
>>> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
>>
>> I can't find any notes regarding functions and schemas in that section.
> 
> Actually, I found another note in the documentation. But it doesn't
> explain things correctly. In the documentation for PostgreSQL 17,
> section 36.17.6.1. (Security Considerations for Extension Functions)
> says:
> 
> "SQL-language and PL-language functions provided by extensions are at
> risk of search-path-based attacks when they are executed, since parsing
> of these functions occurs at execution time not creation time."
> 
> https://www.postgresql.org/docs/17/extend-extensions.html#EXTEND-EXTENSIONS-SECURITY
> 
> So here, the manual explicity states that functions are parsed at
> execution, not creation time. As seen in my original example in this
> thread, this isn't (fully) true. Moreover, it isn't true for all
> SQL-language functions, as can be demonstrated with the following code:
> 
> ============
> 
> CREATE SCHEMA s1;
> CREATE SCHEMA s2;
> 
> CREATE VIEW s1.v AS SELECT 'creation' AS col;
> CREATE VIEW s2.v AS SELECT 'runtime' AS col;
> 
> SET search_path TO 'public', 's1';
> 
> CREATE FUNCTION use_sql_atomic() RETURNS TEXT LANGUAGE sql BEGIN ATOMIC
>    SELECT 'use_sql_atomic = ' || col FROM v;
> END;
> 
> CREATE FUNCTION use_sql_string() RETURNS TEXT LANGUAGE sql AS $$
>    SELECT 'use_sql_string = ' || col FROM v;
> $$;
> 
> CREATE FUNCTION use_plpgsql() RETURNS TEXT LANGUAGE plpgsql AS $$ BEGIN
>    RETURN (SELECT 'use_plpgsql = ' || col FROM v);
> END; $$;
> 
> SET search_path TO 'public', 's2';
> 
> SELECT use_sql_atomic() AS "output" UNION ALL
> SELECT use_sql_string() AS "output" UNION ALL
> SELECT use_plpgsql() AS "output";
> 
> ============
> 
> This generates the following output:
> 
>            output
> ---------------------------
>   use_sql_atomic = creation
>   use_sql_string = runtime
>   use_plpgsql = runtime
> (3 rows)
> 
> Overall, PostgreSQL doesn't behave consistent, and to me it seems that
> the documentation isn't describing its behavior correctly either.

https://www.postgresql.org/docs/current/sql-createfunction.html

"sql_body

     The body of a LANGUAGE SQL function. This can either be a single 
statement

     RETURN expression

     or a block

     BEGIN ATOMIC
       statement;
       statement;
       ...
       statement;
     END

     This is similar to writing the text of the function body as a 
string constant (see definition above), but there are some differences: 
This form only works for LANGUAGE SQL, the string constant form works 
for all languages. This form is parsed at function definition time, the 
string constant form is parsed at execution time; therefore this form 
cannot support polymorphic argument types and other constructs that are 
not resolvable at function definition time. This form tracks 
dependencies between the function and objects used in the function body, 
so DROP ... CASCADE will work correctly, whereas the form using string 
literals may leave dangling functions. Finally, this form is more 
compatible with the SQL standard and other SQL implementations.
"

> 
> I understand if fixing this is too much work (even though I would
> really like to see this fixed). But given that the current behavior is
> highly surprising and inconsistent - and keeping in mind that this is a
> subject that may affect security - I think the documentation should
> reflect the current behavior at least. I thus see this as a
> documentation issue.
> 
> Kind regards,
> Jan Behrens
> 
> 

-- 
Adrian Klaver
[email protected]







^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-01 18:19     ` David G. Johnston <[email protected]>
  2025-01-02 10:37       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  1 sibling, 1 reply; 33+ messages in thread

From: David G. Johnston @ 2025-01-01 18:19 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: [email protected] <[email protected]>

On Wed, Jan 1, 2025 at 10:55 AM Jan Behrens <[email protected]> wrote:

> On Sat, 28 Dec 2024 00:40:09 +0100
> Jan Behrens <[email protected]> wrote:
>
> > On Fri, 27 Dec 2024 13:26:28 -0700
> > "David G. Johnston" <[email protected]> wrote:
> >
> > > > Or is it documented somewhere?
> > >
> > >
> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
> >
> > I can't find any notes regarding functions and schemas in that section.
>
>
"Because PL/pgSQL saves prepared statements and sometimes execution plans
in this way, SQL commands that appear directly in a PL/pgSQL function must
refer to the same tables and columns on every execution; that is, you
cannot use a parameter as the name of a table or column in an SQL command."

Changing search_path is just one possible way to change out which object a
name tries to refer to so it is not called out explicitly.


> "SQL-language and PL-language functions provided by extensions are at
> risk of search-path-based attacks when they are executed, since parsing
> of these functions occurs at execution time not creation time."
>


> Moreover, it isn't true for all
> SQL-language functions, as can be demonstrated with the following code:
>

Yeah, when we added a second method to write an SQL-language function, one
that doesn't simply accept a string body, we didn't update that section to
point out that is the string input variant of create function that is
affected in this manner, the non-string (atomic) variant stores the result
of parsing the inline code as opposed to storing the raw text.

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
@ 2025-01-02 10:37       ` Jan Behrens <[email protected]>
  2025-01-02 11:40         ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Jan Behrens @ 2025-01-02 10:37 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: [email protected] <[email protected]>

On Wed, 1 Jan 2025 11:19:32 -0700
"David G. Johnston" <[email protected]> wrote:

> On Wed, Jan 1, 2025 at 10:55 AM Jan Behrens <[email protected]> wrote:
> 
> > On Sat, 28 Dec 2024 00:40:09 +0100
> > Jan Behrens <[email protected]> wrote:
> >
> > > On Fri, 27 Dec 2024 13:26:28 -0700
> > > "David G. Johnston" <[email protected]> wrote:
> > >
> > > > > Or is it documented somewhere?
> > > >
> > > >
> > https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
> > >
> > > I can't find any notes regarding functions and schemas in that section.
> >
> >
> "Because PL/pgSQL saves prepared statements and sometimes execution plans
> in this way, SQL commands that appear directly in a PL/pgSQL function must
> refer to the same tables and columns on every execution; that is, you
> cannot use a parameter as the name of a table or column in an SQL command."
> 
> Changing search_path is just one possible way to change out which object a
> name tries to refer to so it is not called out explicitly.

The first part of the cited sentence seems helpful ("you must always
refer to the same tables and columns on every execution"). I would thus
conclude that using a dynamic search_path when running functions or
procedures is *always* considered errorneous (even though not reported
by the database as an error), except when using EXECUTE.

I wonder if the database could/should generate an error (or at least a
warning?) when a function or procedure without a "SET search_path"
statement uses a non-qualified name? According to the documentation
using a dynamic search_path to refer to different entities in the
database is a case that "must" not happen.

But following through, this might lead to more warnings one might
expect, e.g. when using simple operators such as "=" or the "IN" or
"CASE expression WHEN" statements, as these rely on the search_path as
well. Should such code be considered non-idiomatic, dangerous, or even
errorneous if a "SET search_path" option is missing in the
function's/procedure's definition?

Maybe I'm overthinking this. But in practice, I've been running into
surprising issues whenever functions and schemas are involved, and I'm
not sure if every programmer will be aware of how important it is to
properly set a search_path in the function's defintion after reading
the documentation. (Besides, it's not always possible in procedures.)

> 
> > "SQL-language and PL-language functions provided by extensions are at
> > risk of search-path-based attacks when they are executed, since parsing
> > of these functions occurs at execution time not creation time."
> 
> > Moreover, it isn't true for all
> > SQL-language functions, as can be demonstrated with the following code:
> 
> Yeah, when we added a second method to write an SQL-language function, one
> that doesn't simply accept a string body, we didn't update that section to
> point out that is the string input variant of create function that is
> affected in this manner, the non-string (atomic) variant stores the result
> of parsing the inline code as opposed to storing the raw text.
> 
> David J.

I missed that other part in the manual (which is in a totally different
section). Should I report the missing update in section 36.17.6.1. of
the documentation as a documentation issue, or is it not necessary?

Kind regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-02 10:37       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-02 11:40         ` Pavel Stehule <[email protected]>
  2025-01-02 12:15           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Pavel Stehule @ 2025-01-02 11:40 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

Hi

čt 2. 1. 2025 v 11:37 odesílatel Jan Behrens <[email protected]>
napsal:

> On Wed, 1 Jan 2025 11:19:32 -0700
> "David G. Johnston" <[email protected]> wrote:
>
> > On Wed, Jan 1, 2025 at 10:55 AM Jan Behrens <[email protected]>
> wrote:
> >
> > > On Sat, 28 Dec 2024 00:40:09 +0100
> > > Jan Behrens <[email protected]> wrote:
> > >
> > > > On Fri, 27 Dec 2024 13:26:28 -0700
> > > > "David G. Johnston" <[email protected]> wrote:
> > > >
> > > > > > Or is it documented somewhere?
> > > > >
> > > > >
> > >
> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
> > > >
> > > > I can't find any notes regarding functions and schemas in that
> section.
> > >
> > >
> > "Because PL/pgSQL saves prepared statements and sometimes execution plans
> > in this way, SQL commands that appear directly in a PL/pgSQL function
> must
> > refer to the same tables and columns on every execution; that is, you
> > cannot use a parameter as the name of a table or column in an SQL
> command."
> >
> > Changing search_path is just one possible way to change out which object
> a
> > name tries to refer to so it is not called out explicitly.
>
> The first part of the cited sentence seems helpful ("you must always
> refer to the same tables and columns on every execution"). I would thus
> conclude that using a dynamic search_path when running functions or
> procedures is *always* considered errorneous (even though not reported
> by the database as an error), except when using EXECUTE.
>
> I wonder if the database could/should generate an error (or at least a
> warning?) when a function or procedure without a "SET search_path"
> statement uses a non-qualified name? According to the documentation
> using a dynamic search_path to refer to different entities in the
> database is a case that "must" not happen.
>
> But following through, this might lead to more warnings one might
> expect, e.g. when using simple operators such as "=" or the "IN" or
> "CASE expression WHEN" statements, as these rely on the search_path as
> well. Should such code be considered non-idiomatic, dangerous, or even
> errorneous if a "SET search_path" option is missing in the
> function's/procedure's definition?
>
> Maybe I'm overthinking this. But in practice, I've been running into
> surprising issues whenever functions and schemas are involved, and I'm
> not sure if every programmer will be aware of how important it is to
> properly set a search_path in the function's defintion after reading
> the documentation. (Besides, it's not always possible in procedures.)
>

How can you identify unwanted usage of non qualified identifiers from
wanted usage of non qualified identifiers? It is a common pattern for
sharding. Using not qualified identifiers of operators, functions is common
when you are using orafce extensions, etc.

Using qualified identifiers everywhere strongly reduces readability. There
are no aliases to the schema, so aliases cannot help.

you can identify the functions where search_path is not explicitly assigned

select oid::regprocedure
  from pg_proc
where pronamespace::regnamespace not in ('pg_catalog',
'information_schema')
   and not exists(select 1 from unnest(proconfig) g(v) where  v ~
'^search_path');


Regards

Pavel


> >
> > > "SQL-language and PL-language functions provided by extensions are at
> > > risk of search-path-based attacks when they are executed, since parsing
> > > of these functions occurs at execution time not creation time."
> >
> > > Moreover, it isn't true for all
> > > SQL-language functions, as can be demonstrated with the following code:
> >
> > Yeah, when we added a second method to write an SQL-language function,
> one
> > that doesn't simply accept a string body, we didn't update that section
> to
> > point out that is the string input variant of create function that is
> > affected in this manner, the non-string (atomic) variant stores the
> result
> > of parsing the inline code as opposed to storing the raw text.
> >
> > David J.
>
> I missed that other part in the manual (which is in a totally different
> section). Should I report the missing update in section 36.17.6.1. of
> the documentation as a documentation issue, or is it not necessary?
>
> Kind regards,
> Jan Behrens
>
>
>


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-02 10:37       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-02 11:40         ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
@ 2025-01-02 12:15           ` Jan Behrens <[email protected]>
  2025-01-02 12:48             ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Jan Behrens @ 2025-01-02 12:15 UTC (permalink / raw)
  To: Pavel Stehule <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Thu, 2 Jan 2025 12:40:59 +0100
Pavel Stehule <[email protected]> wrote:

> How can you identify unwanted usage of non qualified identifiers from
> wanted usage of non qualified identifiers? It is a common pattern for
> sharding. Using not qualified identifiers of operators, functions is common
> when you are using orafce extensions, etc.

I don't fully understand the use-case. Could you elaborate?

As I understand, even if identifiers are not fully-qualified, it is
forbidden to use the search_path to refer to different database
entities at run-time (as David pointed out).

So I don't understand how a dynamic "search_path" could be used in any
scenario within functions except when EXECUTE is involved.

> 
> Using qualified identifiers everywhere strongly reduces readability. There
> are no aliases to the schema, so aliases cannot help.

Yes, I agree on that. Using "SET search_path" in the function's
definition fixes that problem, but it's easy to miss how important this
is from reading the documentation:

The manual regarding "CREATE FUNCTION" refers to "search_path" only
within the "Writing SECURITY DEFINER Functions Safely" section. It's
easy to skip that part unless you use that feature. Moreover, that
section alone doesn't explain the weird behavior of four different
outcomes of a function with only two schemas involved which I brought
up in the beginning of this thread.

The part on "SET configuration_parameter" part in the "CREATE FUNCTION"
documentation doesn't mention the search_path or schemas. And I don't
think you can expect every programmer will read the "Plan Caching"
subsection in the "PL/pgSQL under the Hood" section. But even then, the
information is just provided indirectly.

Searching for "schema" in "CREATE FUNCTION"'s documentation doesn't
give any hint either.

I think (assuming that the behavior isn't fixed) that some slighly more
prominent warning would be reasonable.

> 
> you can identify the functions where search_path is not explicitly assigned
> 
> select oid::regprocedure
>   from pg_proc
> where pronamespace::regnamespace not in ('pg_catalog',
> 'information_schema')
>    and not exists(select 1 from unnest(proconfig) g(v) where  v ~
> '^search_path');
> 
> 
> Regards
> 
> Pavel

Kind regards,
Jan






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-02 10:37       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-02 11:40         ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2025-01-02 12:15           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-02 12:48             ` Pavel Stehule <[email protected]>
  2025-01-02 15:34               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Pavel Stehule @ 2025-01-02 12:48 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

čt 2. 1. 2025 v 13:15 odesílatel Jan Behrens <[email protected]>
napsal:

> On Thu, 2 Jan 2025 12:40:59 +0100
> Pavel Stehule <[email protected]> wrote:
>
> > How can you identify unwanted usage of non qualified identifiers from
> > wanted usage of non qualified identifiers? It is a common pattern for
> > sharding. Using not qualified identifiers of operators, functions is
> common
> > when you are using orafce extensions, etc.
>
> I don't fully understand the use-case. Could you elaborate?
>
> As I understand, even if identifiers are not fully-qualified, it is
> forbidden to use the search_path to refer to different database
> entities at run-time (as David pointed out).
>
> So I don't understand how a dynamic "search_path" could be used in any
> scenario within functions except when EXECUTE is involved.
>

you don't need more databases

schema one - customer x
schema two - customer y

create table one.t1(..); create table one.t2(..);
create table two.t1(..); create table two.t2(..);

set search_path to one;
-- work with data set of customer x

set search_path to two;
-- work wit data set of customer y

some times can be pretty ineffective to have database per customer - more
connect, disconnect in postgres is much more expensive than SET search_path
TO .. and maybe RESET plans;




>
> >
> > Using qualified identifiers everywhere strongly reduces readability.
> There
> > are no aliases to the schema, so aliases cannot help.
>
> Yes, I agree on that. Using "SET search_path" in the function's
> definition fixes that problem, but it's easy to miss how important this
> is from reading the documentation:
>
> The manual regarding "CREATE FUNCTION" refers to "search_path" only
> within the "Writing SECURITY DEFINER Functions Safely" section. It's
> easy to skip that part unless you use that feature. Moreover, that
> section alone doesn't explain the weird behavior of four different
> outcomes of a function with only two schemas involved which I brought
> up in the beginning of this thread.
>
> The part on "SET configuration_parameter" part in the "CREATE FUNCTION"
> documentation doesn't mention the search_path or schemas. And I don't
> think you can expect every programmer will read the "Plan Caching"
> subsection in the "PL/pgSQL under the Hood" section. But even then, the
> information is just provided indirectly.
>
>
yes, probably nobody reads the plan caching doc. And if they read it, then
because they have performance problems.



> Searching for "schema" in "CREATE FUNCTION"'s documentation doesn't
> give any hint either.
>

This is a question - this is a generic feature in Postgres.  Every query
can be impacted by setting of search_path.

From my perspective, there can be a note in the documentation related to
copy types and row types.

https://www.postgresql.org/docs/current/plpgsql-declarations.html#PLPGSQL-DECLARATION-TYPE

The problem that you found is not just about the change of search_path.
Same problem can be found after altering the table.

Regards

Pavel


> I think (assuming that the behavior isn't fixed) that some slighly more
> prominent warning would be reasonable.
>
> >
> > you can identify the functions where search_path is not explicitly
> assigned
> >
> > select oid::regprocedure
> >   from pg_proc
> > where pronamespace::regnamespace not in ('pg_catalog',
> > 'information_schema')
> >    and not exists(select 1 from unnest(proconfig) g(v) where  v ~
> > '^search_path');
> >
> >
> > Regards
> >
> > Pavel
>
> Kind regards,
> Jan
>


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-02 10:37       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-02 11:40         ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2025-01-02 12:15           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-02 12:48             ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
@ 2025-01-02 15:34               ` Jan Behrens <[email protected]>
  2025-01-02 16:20                 ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Jan Behrens @ 2025-01-02 15:34 UTC (permalink / raw)
  To: Pavel Stehule <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Thu, 2 Jan 2025 13:48:29 +0100
Pavel Stehule <[email protected]> wrote:

> čt 2. 1. 2025 v 13:15 odesílatel Jan Behrens <[email protected]>
> napsal:
> 
> > On Thu, 2 Jan 2025 12:40:59 +0100
> > Pavel Stehule <[email protected]> wrote:
> >
> > > How can you identify unwanted usage of non qualified identifiers from
> > > wanted usage of non qualified identifiers? It is a common pattern for
> > > sharding. Using not qualified identifiers of operators, functions is
> > common
> > > when you are using orafce extensions, etc.
> >
> > I don't fully understand the use-case. Could you elaborate?
> >
> > As I understand, even if identifiers are not fully-qualified, it is
> > forbidden to use the search_path to refer to different database
> > entities at run-time (as David pointed out).
> >
> > So I don't understand how a dynamic "search_path" could be used in any
> > scenario within functions except when EXECUTE is involved.
> >
> 
> you don't need more databases
> 
> schema one - customer x
> schema two - customer y
> 
> create table one.t1(..); create table one.t2(..);
> create table two.t1(..); create table two.t2(..);
> 
> set search_path to one;
> -- work with data set of customer x
> 
> set search_path to two;
> -- work wit data set of customer y
> 
> some times can be pretty ineffective to have database per customer - more
> connect, disconnect in postgres is much more expensive than SET search_path
> TO .. and maybe RESET plans;

I guess that means there is a practical application where search_path
MAY change at runtime IF done in different sessions or if the cache is
reset using the DISCARD command:

https://www.postgresql.org/docs/17/sql-discard.html

I assume DISCARD PLANS would be the right command?

This seems to be a very special case though. I think there should be a
warning in the documentation of CREATE FUNCTION with regard to schemas
anyway, though.

Regards,
Jan






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 17:55   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-01 18:19     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-02 10:37       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-02 11:40         ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2025-01-02 12:15           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-02 12:48             ` Re: search_path for PL/pgSQL functions partially cached? Pavel Stehule <[email protected]>
  2025-01-02 15:34               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-02 16:20                 ` Pavel Stehule <[email protected]>
  0 siblings, 0 replies; 33+ messages in thread

From: Pavel Stehule @ 2025-01-02 16:20 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

Hi


> >
> > some times can be pretty ineffective to have database per customer - more
> > connect, disconnect in postgres is much more expensive than SET
> search_path
> > TO .. and maybe RESET plans;
>
> I guess that means there is a practical application where search_path
> MAY change at runtime IF done in different sessions or if the cache is
> reset using the DISCARD command:
>
> https://www.postgresql.org/docs/17/sql-discard.html
>
> I assume DISCARD PLANS would be the right command?
>

that depends. plan inside plan cache is invalidated when search_path is
different. You use RESET plans because you want to release all plans
quickly.

Unfortunately, the types assigned to plpgsql variables are not invalidated.
This is the source of problems. It is a classical problem - it is hard to
say when you should invalidate cache.
Current design is not ideal - but it is almost a good enough compromise
between correctness and performance. It is true, so nobody did some work to
fix it. So maybe the impact to performance should not be too bad, but it is
not an easy issue. plans are isolated - and the impact of one plan to the
second plan is zero. For variables it is exactly opposite.


>
> This seems to be a very special case though. I think there should be a
> warning in the documentation of CREATE FUNCTION with regard to schemas
> anyway, though.
>

I am not sure. If you want to use this warning, then it should be
everywhere where any non-qualified identifier can be used. Maybe in plpgsql
can be more accented so almost everything in plpgsql depends on the current
setting of search_path. Lot of people don't understand, so every expression
in plpgsql is SQL and every expression is executed like part of a query.
And unfortunately there are some different caches - plpgsql cache and plan
cache and both caches are invalidated at different times (I think so
plpgsql cache is not resetted by RESET PLANS). Maybe it is better to
explain how plpgsql works. It is a little bit different from well known
interpreted languages.


> Regards,
> Jan
>


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 12:53   ` Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2 siblings, 1 reply; 33+ messages in thread

From: Jan Behrens @ 2025-01-03 12:53 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: [email protected] <[email protected]>

On Sat, 28 Dec 2024 00:40:09 +0100
Jan Behrens <[email protected]> wrote:

> > Add qualification or attach a “set search_path” clause to “create
> > function”.  Code stored in the server should not rely on the session
> > search_path.
> > 
> > David J.

I have been trying to adjust some of my code, and I still have cases
where I have to rely on the session's search_path. I'll provide an
example below.

> 
> [...]
> 
> My question is: Am I safe if I use fully-qualified types in the DECLARE
> section only? Or do I need to provide full qualification also in the
> code below (after SET search_path TO 'myschema')?
> 
> And bonus question: Is it documented somewhere?
>
> [...]
> 
> Kind Regards
> Jan Behrens

The following code is taken from a project I'm currently working on:

============

-- Let's assume we don't know the name of the schema in which the
-- "pgratio" extension with the RATIONAL data type is installed.
CREATE SCHEMA "qwertyuiop";
CREATE EXTENSION "pgratio" WITH SCHEMA "qwertyuiop";

-- This installs schema "myschema" with some dynamic function:
BEGIN;

CREATE SCHEMA "myschema";
SET LOCAL search_path TO "myschema";

-- Append schema of "pgratio" extension, which provides the RATIONAL
-- data type, to search_path:
SELECT set_config(
  'search_path',
  current_setting('search_path') || ', ' || quote_ident(nspname),
  TRUE
) FROM pg_namespace, pg_extension
  WHERE pg_namespace.oid = extnamespace AND extname = 'pgratio';

CREATE DOMAIN "rational_wrapper" AS RATIONAL;

CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS RATIONAL

  --------------------------------------------------------------------
  -- I cannot use SET search_path FROM CURRENT here, because "query_p"
  -- shall refer to tables in the search_path of the caller.
  --------------------------------------------------------------------

  LANGUAGE plpgsql AS $$
    DECLARE
      "old_search_path" TEXT;

      ----------------------------------------------------------------
      -- I have to fully qualify the following type.
      -- Moreover, I can't use RATIONAL as I don't know its schema.
      ----------------------------------------------------------------

      "result" "myschema"."rational_wrapper";
    BEGIN
      SELECT current_setting('search_path') INTO "old_search_path";
      PERFORM set_config(
        'search_path',
        'myschema, ' || quote_ident(nspname) || ', pg_temp, ' ||
        "old_search_path",
        TRUE
      ) FROM pg_namespace, pg_extension
        WHERE pg_namespace.oid = extnamespace AND extname = 'pgratio';

      ----------------------------------------------------------------
      -- Is it safe to not fully qualify type RATIONAL below?
      -- And, if yes, where in the documentation is this explained?
      ----------------------------------------------------------------

      CREATE TEMP TABLE "mytemptable" ("val" RATIONAL);
      EXECUTE 'INSERT INTO "mytemptable" '
        'SELECT "query"."a" * "query"."b" '
        'FROM (' || "query_p" || ') AS "query"';
      -- Do some things here.
      SELECT sum("val") INTO "result" FROM "mytemptable";
      PERFORM set_config('search_path', "old_search_path", TRUE);
      RETURN "result";
    END;
  $$;

COMMIT;

CREATE TABLE "tbl" ("foo" INT8, "bar" INT8);
INSERT INTO "tbl" VALUES (5, 7), (1, 10);

SELECT "myschema"."some_function"(
  'SELECT "foo" AS "a", "bar" AS "b" FROM "tbl"'
);
\c
SELECT "myschema"."some_function"(
  'SELECT "foo" AS "a", "bar" AS "b" FROM "tbl"'
);

============

The code for the pgratio extension that provides the RATIONAL data type
is found here: https://www.public-software-group.org/pgratio

Running that code on my machine correctly gives:

some_function 
---------------
 45
(1 row)

You are now connected to database "jbe" as user "jbe".
 some_function 
---------------
 45
(1 row)

Because extensions can only be installed in one schema, it may be a bad
idea to have a component requiring an extension to be installed in a
particular schema (because if different components have different
expectations on the schema name, e.g. some might expect "pgratio" to be
installed in "public" and others might expect it in "pgratio" or some
other schema such as "qwertyuiop", this would lead to an unresolvable
conflict).

I would like to know if the above example is correct. It seems overall
bulky, but I haven't found a better way, assuming that it can be
unknown where a particular extension has been installed to. In
particular I feel a bit insecure about where I have to fully qualify,
and where not. See the comments in the code above.

Note that I want the function to accept a query that makes sense in the
caller's search_path. Thus using "SET search_path FROM CURRENT" is not
an option for me, I believe.

Regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 15:34     ` David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: David G. Johnston @ 2025-01-03 15:34 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: [email protected] <[email protected]>

On Friday, January 3, 2025, Jan Behrens <[email protected]> wrote:
>
> I would like to know if the above example is correct. It seems overall
> bulky, but I haven't found a better way, assuming that it can be
> unknown where a particular extension has been installed to. In
> particular I feel a bit insecure about where I have to fully qualify,
> and where not. See the comments in the code above.


Short answer, you cannot looking at a definition and know the answer -
whether the code is going to be executed in a sanitized search_path is what
matters.  Anything that would be executed during pg_restore has to be made
safe.  Therefore, code that is only ever executed by applications directly
can use swarch_path.

I’d probably modify the function signature to take search_path as a second
optional argument and then invoke a set search_path within the function.
At worse the caller can place current_setting(search_path) as the value of
that argument though being explicit would be recommended.

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
@ 2025-01-03 16:48       ` Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Jan Behrens @ 2025-01-03 16:48 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: [email protected] <[email protected]>

On Fri, 3 Jan 2025 08:34:57 -0700
"David G. Johnston" <[email protected]> wrote:

> On Friday, January 3, 2025, Jan Behrens <[email protected]> wrote:
> >
> > I would like to know if the above example is correct. It seems overall
> > bulky, but I haven't found a better way, assuming that it can be
> > unknown where a particular extension has been installed to. In
> > particular I feel a bit insecure about where I have to fully qualify,
> > and where not. See the comments in the code above.
> 
> 
> Short answer, you cannot looking at a definition and know the answer -
> whether the code is going to be executed in a sanitized search_path is what
> matters.

I don't understand. Do you mean my last example is wrong / insecure?
If so, why?

> Anything that would be executed during pg_restore has to be made
> safe.  Therefore, code that is only ever executed by applications directly
> can use swarch_path.

Why should the function be executed during pg_restore?

> 
> I’d probably modify the function signature to take search_path as a second
> optional argument and then invoke a set search_path within the function.
> At worse the caller can place current_setting(search_path) as the value of
> that argument though being explicit would be recommended.
> 
> David J.

I could do that, but I would like to understand if that is really
necessary as it makes the interface more complicated, and I would like
to avoid unnecessary complexity in my interface.

Is it really impossible to have functions without SET search_path in
the definition of a PL/pgSQL function if I fully-qualify all types in
the DECLARE section and if all other non-qualified identifiers occur
after set_config('search_path', ...)?

Kind regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 17:16         ` David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: David G. Johnston @ 2025-01-03 17:16 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: [email protected] <[email protected]>

On Fri, Jan 3, 2025 at 9:48 AM Jan Behrens <[email protected]> wrote:

> On Fri, 3 Jan 2025 08:34:57 -0700
> "David G. Johnston" <[email protected]> wrote:
>
> > On Friday, January 3, 2025, Jan Behrens <[email protected]> wrote:
> > >
> > > I would like to know if the above example is correct. It seems overall
> > > bulky, but I haven't found a better way, assuming that it can be
> > > unknown where a particular extension has been installed to. In
> > > particular I feel a bit insecure about where I have to fully qualify,
> > > and where not. See the comments in the code above.
> >
> >
> > Short answer, you cannot looking at a definition and know the answer -
> > whether the code is going to be executed in a sanitized search_path is
> what
> > matters.
>
> I don't understand. Do you mean my last example is wrong / insecure?
> If so, why?
>

It is at risk because it depends on the session search_path.  That is all.
Whether that risk turns into a failure to execute depends on how/when it is
executed.  I'm not that comfortable talking about security risks in this
context though the current design goal is to mitigate such security issues
by setting things up so the function execution fails rather than is
executed insecurely.  This is presently mainly done by setting the
search_path to just effectively pg_catalog before executing the query,
breaking any code depending on other schemas existing in the search_path.


> > Anything that would be executed during pg_restore has to be made
> > safe.  Therefore, code that is only ever executed by applications
> directly
> > can use swarch_path.
>
> Why should the function be executed during pg_restore?


If the function is used in building an index, or a materialized view, are
the common cases.  Trigger functions too.

Note, this is talking about evaluating functions generally, not the one
provided here specifically.




> I could do that, but I would like to understand if that is really
> necessary as it makes the interface more complicated, and I would like
> to avoid unnecessary complexity in my interface.
>
> Is it really impossible to have functions without SET search_path in
> the definition of a PL/pgSQL function if I fully-qualify all types in
> the DECLARE section and if all other non-qualified identifiers occur
> after set_config('search_path', ...)?
>
> If you add a set_config to the body of the function then you indeed avoid
the problem.  It is basically equivalent to adding a SET clause to the
create function command.  In this case even when the function is executed
in a sanitized search_path environment (such as the one established by
pg_restore) you are not relying on it.  That non-reliance is all that
really matters.

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
@ 2025-01-03 21:33           ` Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 22:13             ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  0 siblings, 2 replies; 33+ messages in thread

From: Jan Behrens @ 2025-01-03 21:33 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: [email protected] <[email protected]>

On Fri, 3 Jan 2025 10:16:15 -0700
"David G. Johnston" <[email protected]> wrote:

> It is at risk because it depends on the session search_path.  That is all.
> Whether that risk turns into a failure to execute depends on how/when it is
> executed.  I'm not that comfortable talking about security risks in this
> context though the current design goal is to mitigate such security issues
> by setting things up so the function execution fails rather than is
> executed insecurely.  This is presently mainly done by setting the
> search_path to just effectively pg_catalog before executing the query,
> breaking any code depending on other schemas existing in the search_path.

I'm not sure if there is a misunderstanding. In my last example (e-mail
dated Fri, 3 Jan 2025 13:53:32 +0100), the user who has control over
the contents of the "query_p" argument is an application programmer,
not a real end-user. The function is also *not* marked as SECURITY
DEFINER, so it always runs with the privileges of the caller. I don't
see any specific security risk here, except that I'm unsure if the
function is written properly with regard to qualification of the used
types after PL/pgSQL's BEGIN. As I learned, I must fully-qualify types
*before* the BEGIN, i.e. in the DECLARE section. But does this also
hold for types after the BEGIN when I previously ensure that the
search_path is correctly set (set within the function's body)?

> 
> > > Anything that would be executed during pg_restore has to be made
> > > safe.  Therefore, code that is only ever executed by applications
> > directly
> > > can use swarch_path.
> >
> > Why should the function be executed during pg_restore?
> 
> 
> If the function is used in building an index, or a materialized view, are
> the common cases.  Trigger functions too.
> 
> Note, this is talking about evaluating functions generally, not the one
> provided here specifically.

I don't think my function would be evaluated during a pg_restore then.

> 
> > I could do that, but I would like to understand if that is really
> > necessary as it makes the interface more complicated, and I would like
> > to avoid unnecessary complexity in my interface.
> >
> > Is it really impossible to have functions without SET search_path in
> > the definition of a PL/pgSQL function if I fully-qualify all types in
> > the DECLARE section and if all other non-qualified identifiers occur
> > after set_config('search_path', ...)?
> >
> If you add a set_config to the body of the function then you indeed avoid
> the problem.  It is basically equivalent to adding a SET clause to the
> create function command.  In this case even when the function is executed
> in a sanitized search_path environment (such as the one established by
> pg_restore) you are not relying on it.  That non-reliance is all that
> really matters.
> 
> David J.

But if I use "SET search_path FROM CURRENT", then the called function
won't know the search_path that is set at the caller's side (which is
what I need to make my interface nice to use).

I would prefer my current solution, but I would like to be sure that my
example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is
correct. I still am not sure about that.

Kind Regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 21:56             ` Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  1 sibling, 1 reply; 33+ messages in thread

From: Adrian Klaver @ 2025-01-03 21:56 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; David G. Johnston <[email protected]>; +Cc: [email protected] <[email protected]>

On 1/3/25 13:33, Jan Behrens wrote:
> On Fri, 3 Jan 2025 10:16:15 -0700
> "David G. Johnston" <[email protected]> wrote:
> 

> 
> But if I use "SET search_path FROM CURRENT", then the called function
> won't know the search_path that is set at the caller's side (which is
> what I need to make my interface nice to use).

At this point I am lost as to what the overall goal of this is.

Can you provide a 10000 ft view if what it is you are trying to achieve?


> 
> I would prefer my current solution, but I would like to be sure that my
> example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is
> correct. I still am not sure about that.
> 
> Kind Regards,
> Jan Behrens
> 
> 

-- 
Adrian Klaver
[email protected]







^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
@ 2025-01-03 23:22               ` Jan Behrens <[email protected]>
  2025-01-03 23:23                 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 23:36                 ` Re: search_path for PL/pgSQL functions partially cached? Isaac Morland <[email protected]>
  2025-01-04 17:37                 ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  0 siblings, 3 replies; 33+ messages in thread

From: Jan Behrens @ 2025-01-03 23:22 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Fri, 3 Jan 2025 13:56:02 -0800
Adrian Klaver <[email protected]> wrote:

> At this point I am lost as to what the overall goal of this is.
> 
> Can you provide a 10000 ft view if what it is you are trying to achieve?

Sure! I would like to create a component (e.g. a PostgreSQL extension)
that provides a function which processes some complex data, without
making any requirements regarding where the data is stored. To pass
this data to the function, I could use arrays of composite types, but
that seems to be very bulky. Another option would be to use cursors,
but that didn't turn out to work very smooth either.

Instead, I plan to expect the function to receive a query string that
will get the data that is being processed by the function.

That query string should be allowed to refer to tables in the
search_path at the caller's side.

Therefore, I cannot use the "SET search_path FROM CURRENT" in my
"CREATE FUNCTION" statement, because it would overwrite the current
search_path on each call of the function.

Thus my idea is to do this (simplified):

CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type"

  --------------------------------------------------------------------
  -- I cannot use SET search_path FROM CURRENT here, because "query_p"
  -- shall refer to tables in the search_path of the caller.
  --------------------------------------------------------------------

  LANGUAGE plpgsql AS $$
    DECLARE
      "old_search_path" TEXT;

      ----------------------------------------------------------------
      -- I have to fully qualify types in the DECLARE section.
      ----------------------------------------------------------------

      "some_variable" "some_schema"."some_type";
    BEGIN
      SELECT current_setting('search_path') INTO "old_search_path";
      PERFORM set_config(
        'search_path',
        'some_schema, pg_temp, ' || "old_search_path",
        TRUE
      );

      ----------------------------------------------------------------
      -- Do I have to fully qualify types and operators from
      -- "myschema" here? Or is it safe to not fully qualify them?
      ----------------------------------------------------------------
    END;
  $$;

That is my overall idea.

My problem is that I'm confused about WHEN EXACTLY I have to qualify
tables/types, etc. It is very hard to understand from reading (just) the
documentation.

Kind Regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 23:23                 ` Jan Behrens <[email protected]>
  2 siblings, 0 replies; 33+ messages in thread

From: Jan Behrens @ 2025-01-03 23:23 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Sat, 4 Jan 2025 00:22:03 +0100
Jan Behrens <[email protected]> wrote:

>       ----------------------------------------------------------------
>       -- Do I have to fully qualify types and operators from
>       -- "myschema" here? Or is it safe to not fully qualify them?
>       ----------------------------------------------------------------

This was meant to read:

...from "some_schema" here.






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 23:36                 ` Isaac Morland <[email protected]>
  2025-01-04 13:23                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2 siblings, 1 reply; 33+ messages in thread

From: Isaac Morland @ 2025-01-03 23:36 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: Adrian Klaver <[email protected]>; David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Fri, 3 Jan 2025 at 18:22, Jan Behrens <[email protected]> wrote:


> Instead, I plan to expect the function to receive a query string that
> will get the data that is being processed by the function.
>
> That query string should be allowed to refer to tables in the
> search_path at the caller's side.
>
> Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> "CREATE FUNCTION" statement, because it would overwrite the current
> search_path on each call of the function.
>

 I wonder if it would help if EXECUTE took an optional search_path to use
while executing the query.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 23:36                 ` Re: search_path for PL/pgSQL functions partially cached? Isaac Morland <[email protected]>
@ 2025-01-04 13:23                   ` Jan Behrens <[email protected]>
  2025-01-05 01:19                     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  0 siblings, 1 reply; 33+ messages in thread

From: Jan Behrens @ 2025-01-04 13:23 UTC (permalink / raw)
  To: Isaac Morland <[email protected]>; +Cc: Adrian Klaver <[email protected]>; David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Fri, 3 Jan 2025 18:36:13 -0500
Isaac Morland <[email protected]> wrote:

> On Fri, 3 Jan 2025 at 18:22, Jan Behrens <[email protected]> wrote:
> 
> 
> > Instead, I plan to expect the function to receive a query string that
> > will get the data that is being processed by the function.
> >
> > That query string should be allowed to refer to tables in the
> > search_path at the caller's side.
> >
> > Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> > "CREATE FUNCTION" statement, because it would overwrite the current
> > search_path on each call of the function.
> >
> 
>  I wonder if it would help if EXECUTE took an optional search_path to use
> while executing the query.

That wouldn't solve my problem, because the function that includes the
EXECUTE still needs to know the search_path set on the caller side.

This only works if I omit the "SET search_path FROM CURRENT" option in
the function's definition OR if I pass a search_path as an argument. I
guess I could write a wrapper:

============

BEGIN;

CREATE SCHEMA "some_schema";
SET LOCAL search_path TO "some_schema";

CREATE TYPE "some_type" AS ("name" TEXT, "height" INT8);

CREATE FUNCTION "foo_impl"("query_p" TEXT, "search_path_p" TEXT)
  RETURNS "some_type"
  LANGUAGE plpgsql SET search_path FROM CURRENT AS $$
    DECLARE
      "old_search_path" TEXT;
      "result" "some_type";
    BEGIN
      "old_search_path" = current_setting('search_path');
      PERFORM set_config('search_path', "search_path_p", TRUE);
      EXECUTE "query_p" INTO "result";
      PERFORM set_config('search_path', "old_search_path", TRUE);
      RETURN "result";
    END;
  $$;

CREATE FUNCTION "foo"("query_p" TEXT)
  RETURNS "some_type"
  RETURN "foo_impl"("query_p", current_setting('search_path'));

COMMIT;

CREATE TABLE "tbl" ("id" SERIAL8, "name" TEXT, "height" INT8);
INSERT INTO "tbl" ("name", "height") VALUES ('Alice', 200);

SELECT * FROM "some_schema"."foo"('SELECT "name" FROM "tbl"');

============

Not sure which variant (this or my previous attempt) is better and if
either is safe/correct.

Regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 23:36                 ` Re: search_path for PL/pgSQL functions partially cached? Isaac Morland <[email protected]>
  2025-01-04 13:23                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-05 01:19                     ` David G. Johnston <[email protected]>
  0 siblings, 0 replies; 33+ messages in thread

From: David G. Johnston @ 2025-01-05 01:19 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: Isaac Morland <[email protected]>; Adrian Klaver <[email protected]>; [email protected] <[email protected]>

On Saturday, January 4, 2025, Jan Behrens <[email protected]> wrote:

>
> CREATE FUNCTION "foo_impl"("query_p" TEXT, "search_path_p" TEXT)
>   RETURNS "some_type"
>   LANGUAGE plpgsql SET search_path FROM CURRENT AS $$
>     DECLARE
>       "old_search_path" TEXT;
>       "result" "some_type";
>     BEGIN
>       "old_search_path" = current_setting('search_path');
>       PERFORM set_config('search_path', "search_path_p", TRUE);
>       EXECUTE "query_p" INTO "result";
>       PERFORM set_config('search_path', "old_search_path", TRUE);
>       RETURN "result";
>     END;
>   $$;
>

You might consider adding a polymorphic argument for the result type.  Then
if you call the function with two different typed inputs it will be cached
once for each.

“ Likewise, functions having polymorphic argument types have a separate
statement cache for each combination of actual argument types they have
been invoked for, so that data type differences do not cause unexpected
failures.”

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-04 17:37                 ` Adrian Klaver <[email protected]>
  2025-01-04 23:12                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2 siblings, 1 reply; 33+ messages in thread

From: Adrian Klaver @ 2025-01-04 17:37 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On 1/3/25 15:22, Jan Behrens wrote:
> On Fri, 3 Jan 2025 13:56:02 -0800
> Adrian Klaver <[email protected]> wrote:
> 
>> At this point I am lost as to what the overall goal of this is.
>>
>> Can you provide a 10000 ft view if what it is you are trying to achieve?
> 
> Sure! I would like to create a component (e.g. a PostgreSQL extension)
> that provides a function which processes some complex data, without
> making any requirements regarding where the data is stored. To pass
> this data to the function, I could use arrays of composite types, but
> that seems to be very bulky. Another option would be to use cursors,
> but that didn't turn out to work very smooth either.
> 
> Instead, I plan to expect the function to receive a query string that
> will get the data that is being processed by the function.
> 
> That query string should be allowed to refer to tables in the
> search_path at the caller's side.
> 
> Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> "CREATE FUNCTION" statement, because it would overwrite the current
> search_path on each call of the function.
> 
> Thus my idea is to do this (simplified):
> 
> CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type"
> 
>    --------------------------------------------------------------------
>    -- I cannot use SET search_path FROM CURRENT here, because "query_p"
>    -- shall refer to tables in the search_path of the caller.
>    --------------------------------------------------------------------
> 
>    LANGUAGE plpgsql AS $$
>      DECLARE
>        "old_search_path" TEXT;
> 
>        ----------------------------------------------------------------
>        -- I have to fully qualify types in the DECLARE section.
>        ----------------------------------------------------------------
> 
>        "some_variable" "some_schema"."some_type";
>      BEGIN
>        SELECT current_setting('search_path') INTO "old_search_path";
>        PERFORM set_config(
>          'search_path',
>          'some_schema, pg_temp, ' || "old_search_path",
>          TRUE
>        );
> 
>        ----------------------------------------------------------------
>        -- Do I have to fully qualify types and operators from
>        -- "myschema" here? Or is it safe to not fully qualify them?
>        ----------------------------------------------------------------
>      END;
>    $$;
> 
> That is my overall idea.

Is 'some_schema' a known item when installing?

Once you have the search_path defined and assuming all the objects you 
want are in that path, then yes you can drop the schema qualification.

> 
> My problem is that I'm confused about WHEN EXACTLY I have to qualify
> tables/types, etc. It is very hard to understand from reading (just) the
> documentation.

If you are doing this as an extension then I suspect you want the 
processes shown here:

https://www.postgresql.org/docs/17/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION


> 
> Kind Regards,
> Jan Behrens

-- 
Adrian Klaver
[email protected]







^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-04 17:37                 ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
@ 2025-01-04 23:12                   ` Jan Behrens <[email protected]>
  2025-01-05 00:04                     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-05 01:40                     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-05 06:48                     ` Re: search_path for PL/pgSQL functions partially cached? Laurenz Albe <[email protected]>
  0 siblings, 3 replies; 33+ messages in thread

From: Jan Behrens @ 2025-01-04 23:12 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; Tom Lane <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Sat, 4 Jan 2025 09:37:14 -0800
Adrian Klaver <[email protected]> wrote:

> On 1/3/25 15:22, Jan Behrens wrote:
> > On Fri, 3 Jan 2025 13:56:02 -0800
> > Adrian Klaver <[email protected]> wrote:
> > 
> >> At this point I am lost as to what the overall goal of this is.
> >>
> >> Can you provide a 10000 ft view if what it is you are trying to achieve?
> > 
> > Sure! I would like to create a component (e.g. a PostgreSQL extension)
> > that provides a function which processes some complex data, without
> > making any requirements regarding where the data is stored. To pass
> > this data to the function, I could use arrays of composite types, but
> > that seems to be very bulky. Another option would be to use cursors,
> > but that didn't turn out to work very smooth either.
> > 
> > Instead, I plan to expect the function to receive a query string that
> > will get the data that is being processed by the function.
> > 
> > That query string should be allowed to refer to tables in the
> > search_path at the caller's side.
> > 
> > Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> > "CREATE FUNCTION" statement, because it would overwrite the current
> > search_path on each call of the function.
> > 
> > Thus my idea is to do this (simplified):
> > 
> > CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type"
> > 
> >    --------------------------------------------------------------------
> >    -- I cannot use SET search_path FROM CURRENT here, because "query_p"
> >    -- shall refer to tables in the search_path of the caller.
> >    --------------------------------------------------------------------
> > 
> >    LANGUAGE plpgsql AS $$
> >      DECLARE
> >        "old_search_path" TEXT;
> > 
> >        ----------------------------------------------------------------
> >        -- I have to fully qualify types in the DECLARE section.
> >        ----------------------------------------------------------------
> > 
> >        "some_variable" "some_schema"."some_type";
> >      BEGIN
> >        SELECT current_setting('search_path') INTO "old_search_path";
> >        PERFORM set_config(
> >          'search_path',
> >          'some_schema, pg_temp, ' || "old_search_path",
> >          TRUE
> >        );
> > 
> >        ----------------------------------------------------------------
> >        -- Do I have to fully qualify types and operators from
> >        -- "myschema" here? Or is it safe to not fully qualify them?
 (correction: "some_schema")
> >        ----------------------------------------------------------------
> >      END;
> >    $$;
> > 
> > That is my overall idea.
> 
> Is 'some_schema' a known item when installing?

Yes, fortunately "some_schema" is a fixed name.

> 
> Once you have the search_path defined and assuming all the objects you 
> want are in that path, then yes you can drop the schema qualification.

That would be nice, but it doesn't seem to be the case. At least not
always. I constructed the following new example:

============

CREATE TABLE "tbl" ("col" NUMERIC(15, 0));

CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
  BEGIN
    RETURN '2.4';
  END;
$$;

BEGIN;

CREATE SCHEMA "myschema";
SET LOCAL search_path TO 'myschema';

CREATE TABLE "tbl" ("col" NUMERIC);

CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
  BEGIN
    RETURN '5.4';
  END;
$$;

CREATE FUNCTION "run"() RETURNS TEXT LANGUAGE plpgsql AS $$
  DECLARE
    "old_search_path" TEXT;
  BEGIN
    "old_search_path" := current_setting('search_path');
    SET LOCAL search_path TO "myschema";
    -- At this point, search_path is always set to 'myschema'!
    DECLARE
      "variable" "tbl"."col"%TYPE;
    BEGIN
      "variable" := "foo"();
      RETURN "variable";
    END;
    PERFORM set_config('search_path', "old_search_path", TRUE);
  END;
$$;

COMMIT;

SELECT "myschema"."run"(); -- returns '5.4' (when run in the same session)

-- reconnect to database here:
\c

SELECT "myschema"."run"(); -- returns '5'
SET search_path TO 'myschema';
SELECT "myschema"."run"(); -- returns '5'

-- reconnect to database again:
\c

SET search_path TO 'myschema';
SELECT "myschema"."run"(); -- returns '5.4'
SET search_path TO 'public';
SELECT "myschema"."run"(); -- returns '5.4'

============

Even if

DECLARE "variable" "tbl"."col"%TYPE;

follows *after* the schema is set to "myschema" in the example above, I
still get differing results, depending on how the search_path was set
when the function was first called.

I think this has to do with the fact that the overall structure and
probably types(?) are parsed first?

As Tom Lane wrote on Fri, 27 Dec 2024 16:03:17 -0500, "the types of
plpgsql variables are only looked up on the first use (within a
session)."

Does this apply to *all* types (e.g. types used in type-casts in
statements after BEGIN)? Or does it only apply to types in the DECLARE
section?

Maybe my most recent example is somewhat "crafted", but it makes me
feel insecure about what I can rely on. Could someone explain to me
what the exact rules are, or where to find them? I don't seem to
understand the exact behavior from reading the docs.

I re-read section 41.11.2. on Plan Caching:

"The PL/pgSQL interpreter parses the function's source text and
produces an internal binary instruction tree the first time the
function is called (within each session). The instruction tree fully
translates the PL/pgSQL statement structure, but individual SQL
expressions and SQL commands used in the function are not translated
immediately.
As each expression and SQL command is first executed in the function,
the PL/pgSQL interpreter parses and analyzes the command to create a
prepared statement, using the SPI manager's SPI_prepare function.
Subsequent visits to that expression or command reuse the prepared
statement. [...]"

It isn't specific about how DECLARE blocks are handled.

> 
> > 
> > My problem is that I'm confused about WHEN EXACTLY I have to qualify
> > tables/types, etc. It is very hard to understand from reading (just) the
> > documentation.
> 
> If you are doing this as an extension then I suspect you want the 
> processes shown here:
> 
> https://www.postgresql.org/docs/17/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION

Yes, I'm aware of that, but the code defining the function is not part
of an extension in my case (only the RATIONAL type is part of an
extension). But thank you for pointing this out. I have been using
@extschema@ in extension code before, and recently also learned about
the @extschema:name@ syntax.

> -- 
> Adrian Klaver
> [email protected]

Many thanks already for all your input.

Regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-04 17:37                 ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-04 23:12                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-05 00:04                     ` David G. Johnston <[email protected]>
  2 siblings, 0 replies; 33+ messages in thread

From: David G. Johnston @ 2025-01-05 00:04 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: Adrian Klaver <[email protected]>; Tom Lane <[email protected]>; [email protected] <[email protected]>

On Saturday, January 4, 2025, Jan Behrens <[email protected]> wrote:
>
>
> Even if
>
> DECLARE "variable" "tbl"."col"%TYPE;
>
> follows *after* the schema is set to "myschema" in the example above, I
> still get differing results, depending on how the search_path was set
> when the function was first called.
>
> I think this has to do with the fact that the overall structure and
> probably types(?) are parsed first?
>

I concur that this dynamic doesn’t seem to be discussed.  Namely that in
the presence of nested blocks the parse phase resolves placeholders for all
declared variables without executing any expressions in the body of the
function; therefore all types will be resolved seeing the same search_path,
namely that of the calling session or established using SET.  Changing the
search_path within an outer function body block will not affect
declarations within an inner block. (I am not sure whether the for-loop
cases are exceptional in this.)

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-04 17:37                 ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-04 23:12                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-05 01:40                     ` David G. Johnston <[email protected]>
  2 siblings, 0 replies; 33+ messages in thread

From: David G. Johnston @ 2025-01-05 01:40 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: Adrian Klaver <[email protected]>; Tom Lane <[email protected]>; [email protected] <[email protected]>

On Saturday, January 4, 2025, Jan Behrens <[email protected]> wrote:

>
> I re-read section 41.11.2. on Plan Caching:
>
> "The PL/pgSQL interpreter parses the function's source text and
> produces an internal binary instruction tree the first time the
> function is called (within each session). The instruction tree fully
> translates the PL/pgSQL statement structure,
>

The type of a plpgsql variable is by definition its structure; established
in a statement, so this is actually covered by that paragraph.  But I would
be for adding a bit more specific terminology here.

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-04 17:37                 ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-04 23:12                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-05 06:48                     ` Laurenz Albe <[email protected]>
  2025-01-05 10:43                       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2 siblings, 1 reply; 33+ messages in thread

From: Laurenz Albe @ 2025-01-05 06:48 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; Adrian Klaver <[email protected]>; Tom Lane <[email protected]>; +Cc: David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Sun, 2025-01-05 at 00:12 +0100, Jan Behrens wrote:
> I constructed the following new example:
> 
> ============
> 
> CREATE TABLE "tbl" ("col" NUMERIC(15, 0));
> 
> CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
>   BEGIN
>     RETURN '2.4';
>   END;
> $$;
> 
> BEGIN;
> 
> CREATE SCHEMA "myschema";
> SET LOCAL search_path TO 'myschema';
> 
> CREATE TABLE "tbl" ("col" NUMERIC);
> 
> CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
>   BEGIN
>     RETURN '5.4';
>   END;
> $$;
> 
> CREATE FUNCTION "run"() RETURNS TEXT LANGUAGE plpgsql AS $$
>   DECLARE
>     "old_search_path" TEXT;
>   BEGIN
>     "old_search_path" := current_setting('search_path');
>     SET LOCAL search_path TO "myschema";
>     -- At this point, search_path is always set to 'myschema'!
>     DECLARE
>       "variable" "tbl"."col"%TYPE;
>     BEGIN
>       "variable" := "foo"();
>       RETURN "variable";
>     END;
>     PERFORM set_config('search_path', "old_search_path", TRUE);
>   END;
> $$;
> 
> COMMIT;
> 
> Even if
> 
> DECLARE "variable" "tbl"."col"%TYPE;
> 
> follows *after* the schema is set to "myschema" in the example above, I
> still get differing results, depending on how the search_path was set
> when the function was first called.

So what you should do is set the "search_path" *on* the function, not *in*
the function:

CREATE FUNCTION "run"() RETURNS TEXT LANGUAGE plpgsql
SET search_path = myschema
AS $$
  DECLARE
    "variable" "tbl"."col"%TYPE;
  BEGIN
    "variable" := "foo"();
    RETURN "variable";
  END;
$$;

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 21:56             ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-03 23:22               ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-04 17:37                 ` Re: search_path for PL/pgSQL functions partially cached? Adrian Klaver <[email protected]>
  2025-01-04 23:12                   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-05 06:48                     ` Re: search_path for PL/pgSQL functions partially cached? Laurenz Albe <[email protected]>
@ 2025-01-05 10:43                       ` Jan Behrens <[email protected]>
  0 siblings, 0 replies; 33+ messages in thread

From: Jan Behrens @ 2025-01-05 10:43 UTC (permalink / raw)
  To: Laurenz Albe <[email protected]>; +Cc: Adrian Klaver <[email protected]>; Tom Lane <[email protected]>; David G. Johnston <[email protected]>; [email protected] <[email protected]>

On Sun, 05 Jan 2025 07:48:56 +0100
Laurenz Albe <[email protected]> wrote:

> So what you should do is set the "search_path" *on* the function, not *in*
> the function:
> 
> CREATE FUNCTION "run"() RETURNS TEXT LANGUAGE plpgsql
> SET search_path = myschema
> AS $$
>   DECLARE
>     "variable" "tbl"."col"%TYPE;
>   BEGIN
>     "variable" := "foo"();
>     RETURN "variable";
>   END;
> $$;
> 
> Yours,
> Laurenz Albe

Yes, that's what works and what I would also do whenever possible
(probably in the form "SET search_path FROM CURRENT").

Summarizing the remaining thread, some issues are:

* The documentation isn't providing a prominent warning that behavior
  can be surprising if "SET search_path" is not used in the function's
  or procedure's defintion. (E.g. searching for "schema" in the
  documentation page for "CREATE FUNCTION" doesn't give any helpful
  hints or warning.)

* Things get more complicated when it's impossible to use
  "SET search_path" in the function's/procedure's definition, for which
  there are two scenarios:

  Scenario 1: The function or procedure needs or wants to access or use
              the search_path of the caller.

  Scenario 2: A procedure wants to execute transactional statements
              such as COMMIT or ROLLBACK within its body.

  In scenario 1, using "SET search_path" will overwrite the caller's
  search_path at runtime, so I cannot access it. (In my post from Sat,
  4 Jan 2025 14:23:10 +0100, I have proposed a wrapper function to work
  around that.)

  In scenario 2, using "SET search_path" is simply not possible and
  will be rejected by PostgreSQL.

* It is a bit unclear how the exact behavior is when I set a
  search_path from within the functions body (e.g. due to one of the two
  scenarios above). There are some examples that show some quite
  surprising behavior, at least if you don't fully understand the
  plan caching mechanism that is used.

Kind regards,
Jan Behrens






^ permalink  raw  reply  [nested|flat] 33+ messages in thread

* Re: search_path for PL/pgSQL functions partially cached?
  2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2024-12-27 23:40 ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 12:53   ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 15:34     ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 16:48       ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
  2025-01-03 17:16         ` Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
  2025-01-03 21:33           ` Re: search_path for PL/pgSQL functions partially cached? Jan Behrens <[email protected]>
@ 2025-01-03 22:13             ` David G. Johnston <[email protected]>
  1 sibling, 0 replies; 33+ messages in thread

From: David G. Johnston @ 2025-01-03 22:13 UTC (permalink / raw)
  To: Jan Behrens <[email protected]>; +Cc: [email protected] <[email protected]>

On Fri, Jan 3, 2025 at 2:33 PM Jan Behrens <[email protected]> wrote:

> I would prefer my current solution, but I would like to be sure that my
> example (the one in my e-mail dated Fri, 3 Jan 2025 13:53:32 +0100) is
> correct. I still am not sure about that.
>
>
If it does what you want and it is only ever executed by application code
over a client connection you should be fine.  Your client connection will
always have whatever search_path you arrange to have in place and the
application developer will know whether their environment is correct or not
when they test it, and have the ability to change their environment as
needed.

David J.


^ permalink  raw  reply  [nested|flat] 33+ messages in thread


end of thread, other threads:[~2025-01-05 10:43 UTC | newest]

Thread overview: 33+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-12-27 20:26 Re: search_path for PL/pgSQL functions partially cached? David G. Johnston <[email protected]>
2024-12-27 20:57 ` Pavel Stehule <[email protected]>
2024-12-27 21:03 ` Tom Lane <[email protected]>
2024-12-27 21:23   ` Pavel Stehule <[email protected]>
2024-12-27 23:40 ` Jan Behrens <[email protected]>
2024-12-28 05:34   ` Pavel Stehule <[email protected]>
2025-01-01 17:55   ` Jan Behrens <[email protected]>
2025-01-01 18:12     ` Adrian Klaver <[email protected]>
2025-01-01 18:19     ` David G. Johnston <[email protected]>
2025-01-02 10:37       ` Jan Behrens <[email protected]>
2025-01-02 11:40         ` Pavel Stehule <[email protected]>
2025-01-02 12:15           ` Jan Behrens <[email protected]>
2025-01-02 12:48             ` Pavel Stehule <[email protected]>
2025-01-02 15:34               ` Jan Behrens <[email protected]>
2025-01-02 16:20                 ` Pavel Stehule <[email protected]>
2025-01-03 12:53   ` Jan Behrens <[email protected]>
2025-01-03 15:34     ` David G. Johnston <[email protected]>
2025-01-03 16:48       ` Jan Behrens <[email protected]>
2025-01-03 17:16         ` David G. Johnston <[email protected]>
2025-01-03 21:33           ` Jan Behrens <[email protected]>
2025-01-03 21:56             ` Adrian Klaver <[email protected]>
2025-01-03 23:22               ` Jan Behrens <[email protected]>
2025-01-03 23:23                 ` Jan Behrens <[email protected]>
2025-01-03 23:36                 ` Isaac Morland <[email protected]>
2025-01-04 13:23                   ` Jan Behrens <[email protected]>
2025-01-05 01:19                     ` David G. Johnston <[email protected]>
2025-01-04 17:37                 ` Adrian Klaver <[email protected]>
2025-01-04 23:12                   ` Jan Behrens <[email protected]>
2025-01-05 00:04                     ` David G. Johnston <[email protected]>
2025-01-05 01:40                     ` David G. Johnston <[email protected]>
2025-01-05 06:48                     ` Laurenz Albe <[email protected]>
2025-01-05 10:43                       ` Jan Behrens <[email protected]>
2025-01-03 22:13             ` David G. Johnston <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox