From: Marcos Magueta
Date: Tue, 6 Jan 2026 15:03:15 -0300
Subject: Re: WIP - xmlvalidate implementation from TODO list
To: Jim Jones
Cc: Andrey Borodin, Kirill Reshke, PostgreSQL Hackers
In-Reply-To: <68a012d3-121b-418a-913b-aa0aaf32915d@uni-muenster.de>

Hey Jim!

On 06.01.26, Jim Jones wrote:
> The result of is R.

That was an oversight on my part. I had a hard time understanding the
standard, but now the fact that both DOCUMENT and CONTENT are accepted
for validation makes more sense. The current patch has some issues.

> xmloption is document_or_content. But xmlvalidate_text_schema() always
> validates as a document.

As Andrey noticed, we should indeed support both document and content.
That entails iterative validation (one pass per node provided) in
content mode, so I should likely add the xmloption back. The fact that
it worked with the example I created was pure luck.

Also, I am not sure whether some variables used inside the PG_TRY block
are safe. None of them is currently declared volatile, despite being
accessed in different parts of the block. Other functions in xml.c
(like xml_parse) seem to handle this correctly.

About the syntax proposal by Jim, I have no problem complying with it.
It considerably increases the scope beyond what I originally intended,
but that's the price of having something actually nice.

I can think of several useful extensions we could consider in a later
implementation:

Schema Dependencies/Imports

    CREATE XMLSCHEMA base AS '...';
    CREATE XMLSCHEMA extended
      IMPORTS base
      AS '...';

Schema Versioning

    CREATE XMLSCHEMA patient VERSION '1.0' AS '...';
    CREATE XMLSCHEMA patient VERSION '2.0' AS '...';
    XMLVALIDATE(doc ACCORDING TO XMLSCHEMA patient VERSION '2.0')

Custom Error Messages

    CREATE XMLSCHEMA patient
      AS '...'
      ERROR MESSAGE 'Patient record does not match schema v2.0';

Schema inference from samples (if the lib supports it, that is)

    CREATE XMLSCHEMA patient
      INFER FROM (SELECT data FROM patient_samples);

And much more, but perhaps that's already too ambitious for a first
version.

I'll wait for the others to chime in.

Regards, Magueta.
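
On the PG_TRY concern: PG_TRY/PG_CATCH are built on sigsetjmp/siglongjmp, so a local variable that is modified in the try section and then read in the catch section should be declared volatile. The standalone sketch below illustrates that rule with plain setjmp/longjmp instead of the PostgreSQL macros. The names validate_or_throw and run_validation are hypothetical and do not come from xml.c or the patch.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf error_ctx;

/* Hypothetical stand-in for a validation step that reports failure by jumping. */
static void
validate_or_throw(int should_fail)
{
    if (should_fail)
        longjmp(error_ctx, 1);
}

/* Returns 0 on success, 1 if the error path ran, -1 if allocation failed. */
static int
run_validation(int should_fail)
{
    /*
     * The pointer itself is volatile: it is assigned after setjmp() and
     * read again after longjmp(), so without the qualifier its value in
     * the error branch would be indeterminate.
     */
    char *volatile schema_buf = NULL;

    if (setjmp(error_ctx) == 0)
    {
        /* Plays the role of the PG_TRY section. */
        schema_buf = malloc(64);
        if (schema_buf == NULL)
            return -1;
        validate_or_throw(should_fail);
    }
    else
    {
        /* Plays the role of the PG_CATCH section: the allocation must be visible. */
        assert(schema_buf != NULL);
        free(schema_buf);
        return 1;
    }

    free(schema_buf);
    return 0;
}
```

Calling run_validation(0) takes the normal path and returns 0, while run_validation(1) jumps into the error branch, frees the buffer there, and returns 1. In the patch, the equivalent would be marking any libxml2 handles that are assigned inside PG_TRY and released in PG_CATCH as volatile.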