Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tTJb7-004hWI-8X for pgsql-general@arkaria.postgresql.org; Thu, 02 Jan 2025 11:42:41 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tTJa6-00EZuU-SQ for pgsql-general@arkaria.postgresql.org; Thu, 02 Jan 2025 11:41:38 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tTJa6-00EZtu-Az for pgsql-general@lists.postgresql.org; Thu, 02 Jan 2025 11:41:38 +0000 Received: from mail-yb1-xb31.google.com ([2607:f8b0:4864:20::b31]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tTJa4-001hxQ-1U for pgsql-general@lists.postgresql.org; Thu, 02 Jan 2025 11:41:37 +0000 Received: by mail-yb1-xb31.google.com with SMTP id 3f1490d57ef6-e46ac799015so12906508276.0 for ; Thu, 02 Jan 2025 03:41:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1735818096; x=1736422896; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=O9T5fJ2SdBJqxkvuQmzMjnnZTQMcNgVG4Lv+Rpcnn3c=; b=lAxBNd09yHPo9SGKLzjeY3eK9ezM2egApBLfjf4kfdqrlIIhX+B4Uf/qKHq1U8/4M7 ChvHAgg4PjaXTZ9digFifzhVWzJhb92PuMZIAjs3vdk8x8PhwM02TRnNe0LsqSEhc8GK C4I0b4njactCJdkKYbrtbBIt1eEResQLk6OB03g46mGAnsQ/0Q7Ic4xu0D9uf1+U6/8S 0yv2vmHHPRd6jsU7duKoEEYROfBmIoirn7gM8shbnLT75PlXQKbaFsC9Mr/cy4z1+FQE i7OSYKXhpBz6l5J7HEX4DXcH9x3Aos0IWGlFPTHh3OP192vEPZoG//eyDQE1nsGQsI76 /fbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735818096; x=1736422896; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=O9T5fJ2SdBJqxkvuQmzMjnnZTQMcNgVG4Lv+Rpcnn3c=; b=Jdnc20xPa/cL3FnlBwwTqen94xAGIDlA2H8Dj8T6H72yRFxx0yqyxpemiHj/vVMoWW h8gjdCFSa0+2GwTUviLie7p1BzpQSruSp4+tJBT91jd4pPtPeDLLEfPP/a+kyvalLI3q OK6g+G9FsbLQA/AGatV1pckG+Immb8frorrB8biznUHdXrO+KmkxnBtFUu3k7akYr/M+ Rc4MGNVEf4UrdaJMhVIlJwSVlS7IB9qkCSe+2NRR82JN3BLPGPmr+2/mouNGfaRs6Vmd gciH5rXYEQuIhPylglm+zvqY7sJdAy0EgA87PIBnJ8CS89PH/q4fmoJgpnsAhf9KF90q eoUA== X-Forwarded-Encrypted: i=1; AJvYcCVPvC329p2TmQzG+3TgTIEr5WKW9X91s+n+s0BXj+osEIMZvywjOsZyOWn2OHtH85O8GgBReym73TxdVCvj@lists.postgresql.org X-Gm-Message-State: AOJu0Yz7Dn13KsyaLuSAv/bbIqyZDFpX/bpl+o5Q+lZ14Io3i3FOpaH0 j22wiwxTz2XWZAFjw0brdhmxQsk8UVna8p8BM63XN42QApsT+BUWaxZ3g3iyG3zB30si2Z9Lq81 sDfhmm30K2auGRL/DuhvQdOuc0yo= X-Gm-Gg: ASbGncsznr0VK28zx7TqhIzdbiFGVljGqISTgydyAzCN9ZKbn2ZcY90ahvhve9EoH41 AY06m5V4pSRbyofM/wZpgHMacCXeJgQJxL8DUksNHPAS2pgFlSOpf+SgODZoW4CQMhXS7zd0= X-Google-Smtp-Source: AGHT+IFr+eukZkXcGldwLWZGQCnPodCXmJgWNgIu/tqRfajbMG7+yZOx/a0P/gP+VjjBCHsWGUpF9z6uUPQMI1CHHa0= X-Received: by 2002:a05:690c:6188:b0:6ec:b10a:22a4 with SMTP id 00721157ae682-6f3f814987fmr309495877b3.25.1735818095660; Thu, 02 Jan 2025 03:41:35 -0800 (PST) MIME-Version: 1.0 References: <20241227205025.1d059f72c7c08d23c9648c26@magnetkern.de> <20241228004009.267f21b78394c934f27f9974@magnetkern.de> <20250101185504.3d50c571c3448512e94288e8@magnetkern.de> <20250102113727.1574b14fd677d164c32160bc@magnetkern.de> In-Reply-To: <20250102113727.1574b14fd677d164c32160bc@magnetkern.de> From: Pavel Stehule Date: Thu, 2 Jan 2025 12:40:59 +0100 Message-ID: Subject: Re: search_path for PL/pgSQL functions partially cached? To: Jan Behrens Cc: "David G. Johnston" , "pgsql-general@lists.postgresql.org" Content-Type: multipart/alternative; boundary="000000000000fc9baa062ab7a426" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000fc9baa062ab7a426 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi =C4=8Dt 2. 1. 2025 v 11:37 odes=C3=ADlatel Jan Behrens napsal: > On Wed, 1 Jan 2025 11:19:32 -0700 > "David G. Johnston" wrote: > > > On Wed, Jan 1, 2025 at 10:55=E2=80=AFAM Jan Behrens > wrote: > > > > > On Sat, 28 Dec 2024 00:40:09 +0100 > > > Jan Behrens wrote: > > > > > > > On Fri, 27 Dec 2024 13:26:28 -0700 > > > > "David G. Johnston" wrote: > > > > > > > > > > Or is it documented somewhere? > > > > > > > > > > > > > > https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGS= QL-PLAN-CACHING > > > > > > > > I can't find any notes regarding functions and schemas in that > section. > > > > > > > > "Because PL/pgSQL saves prepared statements and sometimes execution pla= ns > > in this way, SQL commands that appear directly in a PL/pgSQL function > must > > refer to the same tables and columns on every execution; that is, you > > cannot use a parameter as the name of a table or column in an SQL > command." > > > > Changing search_path is just one possible way to change out which objec= t > a > > name tries to refer to so it is not called out explicitly. > > The first part of the cited sentence seems helpful ("you must always > refer to the same tables and columns on every execution"). I would thus > conclude that using a dynamic search_path when running functions or > procedures is *always* considered errorneous (even though not reported > by the database as an error), except when using EXECUTE. > > I wonder if the database could/should generate an error (or at least a > warning?) when a function or procedure without a "SET search_path" > statement uses a non-qualified name? According to the documentation > using a dynamic search_path to refer to different entities in the > database is a case that "must" not happen. > > But following through, this might lead to more warnings one might > expect, e.g. when using simple operators such as "=3D" or the "IN" or > "CASE expression WHEN" statements, as these rely on the search_path as > well. Should such code be considered non-idiomatic, dangerous, or even > errorneous if a "SET search_path" option is missing in the > function's/procedure's definition? > > Maybe I'm overthinking this. But in practice, I've been running into > surprising issues whenever functions and schemas are involved, and I'm > not sure if every programmer will be aware of how important it is to > properly set a search_path in the function's defintion after reading > the documentation. (Besides, it's not always possible in procedures.) > How can you identify unwanted usage of non qualified identifiers from wanted usage of non qualified identifiers? It is a common pattern for sharding. Using not qualified identifiers of operators, functions is common when you are using orafce extensions, etc. Using qualified identifiers everywhere strongly reduces readability. There are no aliases to the schema, so aliases cannot help. you can identify the functions where search_path is not explicitly assigned select oid::regprocedure from pg_proc where pronamespace::regnamespace not in ('pg_catalog', 'information_schema') and not exists(select 1 from unnest(proconfig) g(v) where v ~ '^search_path'); Regards Pavel > > > > > "SQL-language and PL-language functions provided by extensions are at > > > risk of search-path-based attacks when they are executed, since parsi= ng > > > of these functions occurs at execution time not creation time." > > > > > Moreover, it isn't true for all > > > SQL-language functions, as can be demonstrated with the following cod= e: > > > > Yeah, when we added a second method to write an SQL-language function, > one > > that doesn't simply accept a string body, we didn't update that section > to > > point out that is the string input variant of create function that is > > affected in this manner, the non-string (atomic) variant stores the > result > > of parsing the inline code as opposed to storing the raw text. > > > > David J. > > I missed that other part in the manual (which is in a totally different > section). Should I report the missing update in section 36.17.6.1. of > the documentation as a documentation issue, or is it not necessary? > > Kind regards, > Jan Behrens > > > --000000000000fc9baa062ab7a426 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi

=C4=8Dt 2. 1. 2025 v=C2= =A011:37 odes=C3=ADlatel Jan Behrens <jbe-mlist@magnetkern.de> napsal:
On Wed, 1 Jan 2025 11:19:32 -0700
"David G. Johnston" <david.g.johnston@gmail.com> wrote:

> On Wed, Jan 1, 2025 at 10:55=E2=80=AFAM Jan Behrens <jbe-mlist@magnetkern.de&= gt; wrote:
>
> > On Sat, 28 Dec 2024 00:40:09 +0100
> > Jan Behrens <jbe-mlist@magnetkern.de> wrote:
> >
> > > On Fri, 27 Dec 2024 13:26:28 -0700
> > > "David G. Johnston" <david.g.johnston@gmail.com> wr= ote:
> > >
> > > > > Or is it documented somewhere?
> > > >
> > > >
> > ht= tps://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-P= LAN-CACHING
> > >
> > > I can't find any notes regarding functions and schemas i= n that section.
> >
> >
> "Because PL/pgSQL saves prepared statements and sometimes executi= on plans
> in this way, SQL commands that appear directly in a PL/pgSQL function = must
> refer to the same tables and columns on every execution; that is, you<= br> > cannot use a parameter as the name of a table or column in an SQL comm= and."
>
> Changing search_path is just one possible way to change out which obje= ct a
> name tries to refer to so it is not called out explicitly.

The first part of the cited sentence seems helpful ("you must always refer to the same tables and columns on every execution"). I would thu= s
conclude that using a dynamic search_path when running functions or
procedures is *always* considered errorneous (even though not reported
by the database as an error), except when using EXECUTE.

I wonder if the database could/should generate an error (or at least a
warning?) when a function or procedure without a "SET search_path"= ;
statement uses a non-qualified name? According to the documentation
using a dynamic search_path to refer to different entities in the
database is a case that "must" not happen.

But following through, this might lead to more warnings one might
expect, e.g. when using simple operators such as "=3D" or the &qu= ot;IN" or
"CASE expression WHEN" statements, as these rely on the search_pa= th as
well. Should such code be considered non-idiomatic, dangerous, or even
errorneous if a "SET search_path" option is missing in the
function's/procedure's definition?

Maybe I'm overthinking this. But in practice, I've been running int= o
surprising issues whenever functions and schemas are involved, and I'm<= br> not sure if every programmer will be aware of how important it is to
properly set a search_path in the function's defintion after reading the documentation. (Besides, it's not always possible in procedures.)

How can you identify unwanted usage of n= on qualified identifiers from wanted usage of non qualified identifiers? It= is a common pattern for sharding. Using not qualified identifiers of opera= tors, functions is common when you are using orafce extensions, etc.

Using qualified identifiers everywhere strongly redu= ces readability. There are no aliases to the schema, so aliases cannot help= .

you can identify the functions where search_path= is not explicitly assigned

select oid::regprocedu= re=C2=A0
=C2=A0 from pg_proc=C2=A0
where pronamespace::= regnamespace not in ('pg_catalog', 'information_schema')=C2= =A0
=C2=A0=C2=A0 and not exists(select 1 from unnest(proconfig) g= (v) where =C2=A0v ~ '^search_path');


<= /div>
Regards

Pavel


>
> > "SQL-language and PL-language functions provided by extensio= ns are at
> > risk of search-path-based attacks when they are executed, since p= arsing
> > of these functions occurs at execution time not creation time.&qu= ot;
>
> > Moreover, it isn't true for all
> > SQL-language functions, as can be demonstrated with the following= code:
>
> Yeah, when we added a second method to write an SQL-language function,= one
> that doesn't simply accept a string body, we didn't update tha= t section to
> point out that is the string input variant of create function that is<= br> > affected in this manner, the non-string (atomic) variant stores the re= sult
> of parsing the inline code as opposed to storing the raw text.
>
> David J.

I missed that other part in the manual (which is in a totally different
section). Should I report the missing update in section 36.17.6.1. of
the documentation as a documentation issue, or is it not necessary?

Kind regards,
Jan Behrens


--000000000000fc9baa062ab7a426--