Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tTKdW-004ojh-37 for pgsql-general@arkaria.postgresql.org; Thu, 02 Jan 2025 12:49:14 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tTKdT-00FCdS-Ek for pgsql-general@arkaria.postgresql.org; Thu, 02 Jan 2025 12:49:11 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tTKdS-00FCdI-TF for pgsql-general@lists.postgresql.org; Thu, 02 Jan 2025 12:49:10 +0000 Received: from mail-yb1-xb31.google.com ([2607:f8b0:4864:20::b31]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tTKdQ-001ipm-1Y for pgsql-general@lists.postgresql.org; Thu, 02 Jan 2025 12:49:09 +0000 Received: by mail-yb1-xb31.google.com with SMTP id 3f1490d57ef6-e53ef7462b6so9987623276.3 for ; Thu, 02 Jan 2025 04:49:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1735822148; x=1736426948; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=R6XVMQxuekgtpn8zEVF7oajvOfkPe917PnxFJ1Hslik=; b=IoMGIOrKDLY4mFxskjsQZ8uilb29lV0gBugpDsckLsmpEFd7+3OsoHoVC67r8PWEtn aDV97Ij4pF4aN07v2TBTM7IAKSkfUYkXlJKQhw/5ZpmibPAmby3KGi7SNg1sX8OM1Ehd 5b3O7QMVCOoyTtedzX8vWF7GjLHhJc7fti5M2BywcxRPy/p1cedVKaq7htUM7YyrmwE3 mKCZdAZikP1+FxQNgpTaJMO1LbvPB1oT/N5yjV1Ctedq3kYthBrIcn03/quauMWzmk31 JeV36XXsisJSLgfsUZlRpbopMYLBighF5gjiNJrlh7ALkrDuhfxWECMMLbNd1KKFRKqB Rg9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735822148; x=1736426948; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=R6XVMQxuekgtpn8zEVF7oajvOfkPe917PnxFJ1Hslik=; b=Fy3r6++rknIlwq+Nt57ZPdIB36uzhF9jCY73W56GigyDgvV71nqlMwLEMPITyqtkwq gCZfdPHRi+uhYeYDjhkVtpgeCYtkdeMBoV1li/ttvOpPdh5rgCWyAT2gXf8eOxxNQF2t /dKDVkQ79A0f1pqlsxDD1KYzqMZeOnjNd3Lmf3ZitDhp4EsjuRKBCui4ySdJJOKBfCsa PqVmZrrn5DPuP1GJdIu2O2sEhI7rdOhxNbG+SBKgKelgbmq3gmSIX1YOMTqFHLFW/bkM Okm+9nxi92iNyAbcS6hhOuf36eLfcXPP5XmOUR6IcccLK+2vxTdNojosPJSoVqyOywDQ jf9g== X-Forwarded-Encrypted: i=1; AJvYcCXuFaP/OeEM5n5DJm0O7ZBtdbSGcj4eMqtMin5/rpU2dkjzzJ/QpKTiWlruHFXJ3d8+/Fqhy6IZjkN0Ttku@lists.postgresql.org X-Gm-Message-State: AOJu0YxMHNdyfzCzMeKjOyIq0tFmOkprEfEojX7mC2hT/aBcR7YLdd29 JMmQYQaZDffNQIFNsLPVUs+eamwHhmoYoVBCD+RtXO7uVrNURyGA42HVBj86HWAbSGOtJyvorKE kA2F7FbRGIJUVuisdHsHVItihKTI= X-Gm-Gg: ASbGncuzfiT8KP4v3cs1uYSnwQS5TLNwFCktpqNkUlbVK9kUBsK+ztIsr/acpyXVlZj HB55c1P2fOG+1NP+OBO52Mv5qXFyVLK/iZHSrS8uV+SFBoQlFule5zdvvF7hTDIAcjC8PAe4= X-Google-Smtp-Source: AGHT+IFND28dLU3Tuot5XUReNqBtc/s/SC2coxjMGLNpAsySIsTCTMHPtAONc/I/Xj+KPxt/jfUiCsfrNxoiWae1Bs8= X-Received: by 2002:a05:690c:6203:b0:6ef:6bdf:36d8 with SMTP id 00721157ae682-6f3f823949amr335207147b3.36.1735822146187; Thu, 02 Jan 2025 04:49:06 -0800 (PST) MIME-Version: 1.0 References: <20241227205025.1d059f72c7c08d23c9648c26@magnetkern.de> <20241228004009.267f21b78394c934f27f9974@magnetkern.de> <20250101185504.3d50c571c3448512e94288e8@magnetkern.de> <20250102113727.1574b14fd677d164c32160bc@magnetkern.de> <20250102131547.f47d7eeb27c7a8256a8cd189@magnetkern.de> In-Reply-To: <20250102131547.f47d7eeb27c7a8256a8cd189@magnetkern.de> From: Pavel Stehule Date: Thu, 2 Jan 2025 13:48:29 +0100 Message-ID: Subject: Re: search_path for PL/pgSQL functions partially cached? To: Jan Behrens Cc: "David G. Johnston" , "pgsql-general@lists.postgresql.org" Content-Type: multipart/alternative; boundary="0000000000006ab8a4062ab89647" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000006ab8a4062ab89647 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable =C4=8Dt 2. 1. 2025 v 13:15 odes=C3=ADlatel Jan Behrens napsal: > On Thu, 2 Jan 2025 12:40:59 +0100 > Pavel Stehule wrote: > > > How can you identify unwanted usage of non qualified identifiers from > > wanted usage of non qualified identifiers? It is a common pattern for > > sharding. Using not qualified identifiers of operators, functions is > common > > when you are using orafce extensions, etc. > > I don't fully understand the use-case. Could you elaborate? > > As I understand, even if identifiers are not fully-qualified, it is > forbidden to use the search_path to refer to different database > entities at run-time (as David pointed out). > > So I don't understand how a dynamic "search_path" could be used in any > scenario within functions except when EXECUTE is involved. > you don't need more databases schema one - customer x schema two - customer y create table one.t1(..); create table one.t2(..); create table two.t1(..); create table two.t2(..); set search_path to one; -- work with data set of customer x set search_path to two; -- work wit data set of customer y some times can be pretty ineffective to have database per customer - more connect, disconnect in postgres is much more expensive than SET search_path TO .. and maybe RESET plans; > > > > > Using qualified identifiers everywhere strongly reduces readability. > There > > are no aliases to the schema, so aliases cannot help. > > Yes, I agree on that. Using "SET search_path" in the function's > definition fixes that problem, but it's easy to miss how important this > is from reading the documentation: > > The manual regarding "CREATE FUNCTION" refers to "search_path" only > within the "Writing SECURITY DEFINER Functions Safely" section. It's > easy to skip that part unless you use that feature. Moreover, that > section alone doesn't explain the weird behavior of four different > outcomes of a function with only two schemas involved which I brought > up in the beginning of this thread. > > The part on "SET configuration_parameter" part in the "CREATE FUNCTION" > documentation doesn't mention the search_path or schemas. And I don't > think you can expect every programmer will read the "Plan Caching" > subsection in the "PL/pgSQL under the Hood" section. But even then, the > information is just provided indirectly. > > yes, probably nobody reads the plan caching doc. And if they read it, then because they have performance problems. > Searching for "schema" in "CREATE FUNCTION"'s documentation doesn't > give any hint either. > This is a question - this is a generic feature in Postgres. Every query can be impacted by setting of search_path. From my perspective, there can be a note in the documentation related to copy types and row types. https://www.postgresql.org/docs/current/plpgsql-declarations.html#PLPGSQL-D= ECLARATION-TYPE The problem that you found is not just about the change of search_path. Same problem can be found after altering the table. Regards Pavel > I think (assuming that the behavior isn't fixed) that some slighly more > prominent warning would be reasonable. > > > > > you can identify the functions where search_path is not explicitly > assigned > > > > select oid::regprocedure > > from pg_proc > > where pronamespace::regnamespace not in ('pg_catalog', > > 'information_schema') > > and not exists(select 1 from unnest(proconfig) g(v) where v ~ > > '^search_path'); > > > > > > Regards > > > > Pavel > > Kind regards, > Jan > --0000000000006ab8a4062ab89647 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=C4=8Dt 2. 1. 2= 025 v=C2=A013:15 odes=C3=ADlatel Jan Behrens <jbe-mlist@magnetkern.de> napsal:
On Thu, 2 Jan 2025 12:40:59 +0100<= br> Pavel Stehule <pavel.stehule@gmail.com> wrote:

> How can you identify unwanted usage of non qualified identifiers from<= br> > wanted usage of non qualified identifiers? It is a common pattern for<= br> > sharding. Using not qualified identifiers of operators, functions is c= ommon
> when you are using orafce extensions, etc.

I don't fully understand the use-case. Could you elaborate?

As I understand, even if identifiers are not fully-qualified, it is
forbidden to use the search_path to refer to different database
entities at run-time (as David pointed out).

So I don't understand how a dynamic "search_path" could be us= ed in any
scenario within functions except when EXECUTE is involved.
=

you don't need more databases

<= div>schema one - customer x
schema two - customer y
create table one.t1(..); create table one.t2(..);
create table two.t1(..); create table two.t2(..);

set search_path to one;
-- work with data set of customer = x

set search_path to two;
-- work wit da= ta set of customer y

some times can be pretty inef= fective to have database per customer - more connect, disconnect in postgre= s is much more expensive than SET search_path TO .. and maybe RESET plans;<= /div>


=C2=A0

>
> Using qualified identifiers everywhere strongly reduces readability. T= here
> are no aliases to the schema, so aliases cannot help.

Yes, I agree on that. Using "SET search_path" in the function'= ;s
definition fixes that problem, but it's easy to miss how important this=
is from reading the documentation:

The manual regarding "CREATE FUNCTION" refers to "search_pat= h" only
within the "Writing SECURITY DEFINER Functions Safely" section. I= t's
easy to skip that part unless you use that feature. Moreover, that
section alone doesn't explain the weird behavior of four different
outcomes of a function with only two schemas involved which I brought
up in the beginning of this thread.

The part on "SET configuration_parameter" part in the "CREAT= E FUNCTION"
documentation doesn't mention the search_path or schemas. And I don'= ;t
think you can expect every programmer will read the "Plan Caching"= ;
subsection in the "PL/pgSQL under the Hood" section. But even the= n, the
information is just provided indirectly.


yes, probably nobody reads the plan ca= ching doc. And if they read it, then because they have performance problems= .

=C2=A0
Searching for "schema" in "CREATE FUNCTION"'s docum= entation doesn't
give any hint either.

This is a questio= n - this is a generic feature in Postgres.=C2=A0 Every query can be impacte= d by setting of search_path.

From my perspective, = there can be a note in the documentation related to copy types and row type= s.


The problem that you found is not just about= the change of search_path. Same problem can be found after altering the ta= ble.

Regards

Pavel


I think (assuming that the behavior isn't fixed) that some slighly more=
prominent warning would be reasonable.

>
> you can identify the functions where search_path is not explicitly ass= igned
>
> select oid::regprocedure
>=C2=A0 =C2=A0from pg_proc
> where pronamespace::regnamespace not in ('pg_catalog',
> 'information_schema')
>=C2=A0 =C2=A0 and not exists(select 1 from unnest(proconfig) g(v) where= =C2=A0 v ~
> '^search_path');
>
>
> Regards
>
> Pavel

Kind regards,
Jan
--0000000000006ab8a4062ab89647--