From: Kirill Reshke
Date: Thu, 1 Jan 2026 13:25:49 +0500
Subject: Re: WIP - xmlvalidate implementation from TODO list
To: Marcos Magueta
Cc: PostgreSQL Hackers

On Thu, 1 Jan 2026, 01:27 Marcos Magueta <maguetamarcos@gmail.com> wrote:

> Hello again!
>
> Is there any interest in this? I understand PostgreSQL has bigger fish to
> fry, but I would like to at least know, in case this was just forgotten.
>
> Regards!
>
> On Fri, 19 Dec 2025 at 00:25, Marcos Magueta <maguetamarcos@gmail.com>
> wrote:
>
>> Hello again!
>>
>> I took some time to actually finish this feature. I think the answers
>> to the previous questions are now clearer. I checked the
>> initialization, and the protections have indeed been in place since
>> commit a4b0c0aaf093a015bebe83a24c183e10a66c8c39, which specifically
>> states:
>>
>> > Prevent access to external files/URLs via XML entity references.
>>
>> > xml_parse() would attempt to fetch external files or URLs as needed to
>> > resolve DTD and entity references in an XML value, thus allowing
>> > unprivileged database users to attempt to fetch data with the privileges
>> > of the database server.  While the external data wouldn't get returned
>> > directly to the user, portions of it could be exposed in error messages
>> > if the data didn't parse as valid XML; and in any case the mere ability
>> > to check existence of a file might be useful to an attacker.
>> >
>> > The ideal solution to this would still allow fetching of references that
>> > are listed in the host system's XML catalogs, so that documents can be
>> > validated according to installed DTDs.  However, doing that with the
>> > available libxml2 APIs appears complex and error-prone, so we're not going
>> > to risk it in a security patch that necessarily hasn't gotten wide review.
>> > So this patch merely shuts off all access, causing any external fetch to
>> > silently expand to an empty string.  A future patch may improve this.
>>
>> Given that, the obvious choice for the xmlvalidate implementation
>> was not to rely on external schema sources from the host
>> catalog. The implementation therefore relies solely on expressions
>> that necessarily evaluate to a schema in plain text.
>>
>> I added the requested documentation and a bunch of tests for each
>> scenario. I would appreciate another round of reviews whenever someone
>> has the time and patience.
>>
>> Finally, to satisfy the curious: I had issues with make check, as
>> mentioned earlier in this thread. These were resolved when I changed
>> `execl` to `execlp` in `pg_regress.c`. I of course did not commit
>> that change, but other people I know have had the very same issue
>> while relying on immutable package managers.
>>

Hi!
First of all, please do not top-post 🙏. Use bottom-posting.

About general interest in the feature: I suspect that we as a community
are generally interested in implementing items from the TODO list. This
feature also improves SQL standard compatibility. But I am not a big
SQL/XML user myself, so I can only give a limited review here. I also did
not have much time last month. I will try to find the cycles to take
another look.