public inbox for [email protected]
help / color / mirror / Atom feedFrom: Marcos Magueta <[email protected]>
To: Kirill Reshke <[email protected]>
Cc: [email protected]
Subject: Re: WIP - xmlvalidate implementation from TODO list
Date: Fri, 19 Dec 2025 00:25:51 -0300
Message-ID: <CAN3aFCfvVgXr77o=dB_E2kSCY+EgckSQbSBdd_N9n-LauWuQLw@mail.gmail.com> (raw)
In-Reply-To: <CAN3aFCesNDiL-iZg4imC0n+NgT3JywqZYkuGH83u8ssLjJ-p5Q@mail.gmail.com>
References: <CAN3aFCdx8AapWSVpJ1kaC7OC_v7QwbjgbGw9WfPBBY2GMyOadQ@mail.gmail.com>
<CALdSSPjxLU+zhWx+CgwN+VHoHTso33trY6mse1A6Jks7hWAdrA@mail.gmail.com>
<CAN3aFCesNDiL-iZg4imC0n+NgT3JywqZYkuGH83u8ssLjJ-p5Q@mail.gmail.com>
Hello again!
I took some time to actually finish this feature. I think the answers
for the previous questions are now clearer. I checked the
initialization and the protections are indeed in place since commit
a4b0c0aaf093a015bebe83a24c183e10a66c8c39, which specifically states:
> Prevent access to external files/URLs via XML entity references.
> xml_parse() would attempt to fetch external files or URLs as needed to
> resolve DTD and entity references in an XML value, thus allowing
> unprivileged database users to attempt to fetch data with the privileges
> of the database server. While the external data wouldn't get returned
> directly to the user, portions of it could be exposed in error messages
> if the data didn't parse as valid XML; and in any case the mere ability
> to check existence of a file might be useful to an attacker.
>
> The ideal solution to this would still allow fetching of references that
> are listed in the host system's XML catalogs, so that documents can be
> validated according to installed DTDs. However, doing that with the
> available libxml2 APIs appears complex and error-prone, so we're not going
> to risk it in a security patch that necessarily hasn't gotten wide review.
> So this patch merely shuts off all access, causing any external fetch to
> silently expand to an empty string. A future patch may improve this.
With that, the obvious affordance on the xmlvalidate implementation
was to not rely on external schema sources on the host
catalog. Therefore the implementation relies solely on expressions
that necessarily evaluate to a schema in plain text.
I added the requested documentation and a bunch of tests for each
scenario. I would appreciate another round of reviews whenever someone
has the time and patience.
At last, to nourish the curiosity: I had issues with make check, as
stated above on the e-mail thread. These got resolved when I changed
`execl` to `execlp` on `pg_regress.c`. I of course did not commit
such, but more people I know have had the very same issue while
relying on immutable package managers.
Attachments:
[application/octet-stream] 0001-full-xmlvalidate-text-schema-implementation.patch (33.6K, 3-0001-full-xmlvalidate-text-schema-implementation.patch)
download | inline diff:
From fb56c90da3cab9d85e2f6262fdaf1f16d5778d28 Mon Sep 17 00:00:00 2001
From: Marcos Magueta <[email protected]>
Date: Thu, 18 Dec 2025 23:17:26 -0300
Subject: [PATCH 1/1] full xmlvalidate text schema implementation
---
doc/src/sgml/func/func-xml.sgml | 119 +++++++++++
src/backend/executor/execExprInterp.c | 24 +++
src/backend/nodes/nodeFuncs.c | 2 +
src/backend/parser/gram.y | 21 +-
src/backend/parser/parse_expr.c | 8 +
src/backend/parser/parse_target.c | 3 +
src/backend/utils/adt/ruleutils.c | 16 +-
src/backend/utils/adt/xml.c | 108 +++++++++-
src/include/nodes/primnodes.h | 1 +
src/include/parser/kwlist.h | 3 +
src/include/utils/xml.h | 1 +
src/test/regress/expected/xml.out | 280 ++++++++++++++++++++++++++
src/test/regress/sql/xml.sql | 216 ++++++++++++++++++++
13 files changed, 798 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/func/func-xml.sgml b/doc/src/sgml/func/func-xml.sgml
index 511bc90852a..b02db6c9b5b 100644
--- a/doc/src/sgml/func/func-xml.sgml
+++ b/doc/src/sgml/func/func-xml.sgml
@@ -1010,6 +1010,125 @@ SELECT xmltable.*
]]></screen>
</para>
</sect3>
+
+ <sect3 id="functions-xml-processing-xmlvalidate">
+ <title><literal>xmlvalidate</literal></title>
+
+ <indexterm>
+ <primary>xmlvalidate</primary>
+ </indexterm>
+
+<synopsis>
+<function>XMLVALIDATE</function> ( {<literal>DOCUMENT</literal>|<literal>CONTENT</literal>} <replaceable>xml_value</replaceable> <literal>ACCORDING TO XMLSCHEMA</literal> <replaceable>schema_text</replaceable> ) <returnvalue>boolean</returnvalue>
+</synopsis>
+
+ <para>
+ The <function>xmlvalidate</function> function validates an XML value
+ against an XML Schema (XSD). It returns <literal>true</literal> if the
+ XML is valid according to the schema, <literal>false</literal> if it is
+ invalid, or <literal>NULL</literal> if either argument is
+ <literal>NULL</literal>.
+ </para>
+
+ <para>
+ The first argument specifies whether to validate the XML as a
+ <literal>DOCUMENT</literal> (a complete XML document with a single root
+ element) or as <literal>CONTENT</literal> (an XML content fragment).
+ </para>
+
+ <para>
+ The <replaceable>schema_text</replaceable> argument should be a
+ <type>text</type> value containing a valid XML Schema Definition (XSD).
+ For security reasons, the schema is treated as plain text and parsed
+ in-memory only. This prevents malicious imports or external file access
+ through schema location references (such as <literal>xs:import</literal>,
+ <literal>xs:include</literal>, or <literal>schemaLocation</literal>
+ attributes). Any attempts to reference external resources will be
+ ignored, resulting in an empty schema reference.
+ </para>
+
+ <para>
+ Examples:
+<screen><![CDATA[
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name><age>30</age></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+ xmlvalidate
+-------------
+ t
+(1 row)
+]]></screen>
+ </para>
+
+ <para>
+ This example shows a validation failure due to a missing required element:
+<screen><![CDATA[
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+ xmlvalidate
+-------------
+ f
+(1 row)
+]]></screen>
+ </para>
+
+ <para>
+ The schema can also validate attributes:
+<screen><![CDATA[
+SELECT xmlvalidate(DOCUMENT '<product id="123"><name>Widget</name><price>9.99</price></product>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ <xs:attribute name="id" type="xs:string" use="required"/>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+ xmlvalidate
+-------------
+ t
+(1 row)
+]]></screen>
+ </para>
+
+ <note>
+ <para>
+ For security, <function>xmlvalidate</function> processes the XML Schema
+ as an in-memory text value. External resource references such as
+ <literal>schemaLocation</literal> in <literal>xs:import</literal> or
+ <literal>xs:include</literal> directives are not followed and are
+ treated as empty schema references. This design prevents potential
+ security vulnerabilities where malicious schemas could attempt to
+ access external files or network resources.
+ </para>
+ </note>
+ </sect3>
</sect2>
<sect2 id="functions-xml-mapping">
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 5e7bd933afc..e1aed4f7ff4 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -4626,6 +4626,30 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
}
break;
+ case IS_XMLVALIDATE:
+ {
+ Datum *argvalue = op->d.xmlexpr.argvalue;
+ bool *argnull = op->d.xmlexpr.argnull;
+ xmltype *data;
+ text *schema;
+
+ /* arguments are known to be xml, text */
+ Assert(list_length(xexpr->args) == 2);
+
+ if (argnull[0] || argnull[1])
+ {
+ *op->resnull = true;
+ return;
+ }
+
+ data = DatumGetXmlP(argvalue[0]);
+ schema = DatumGetTextPP(argvalue[1]);
+
+ *op->resvalue = BoolGetDatum(xmlvalidate_text_schema(data, schema));
+ *op->resnull = false;
+ }
+ break;
+
case IS_DOCUMENT:
{
Datum *argvalue = op->d.xmlexpr.argvalue;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 024a2b2fd84..acb67f38b82 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -218,6 +218,8 @@ exprType(const Node *expr)
case T_XmlExpr:
if (((const XmlExpr *) expr)->op == IS_DOCUMENT)
type = BOOLOID;
+ else if (((const XmlExpr *) expr)->op == IS_XMLVALIDATE)
+ type = BOOLOID;
else if (((const XmlExpr *) expr)->op == IS_XMLSERIALIZE)
type = TEXTOID;
else
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7856ce9d78f..d1e878a96ed 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -704,7 +704,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
*/
/* ordinary key words in alphabetical order */
-%token <keyword> ABORT_P ABSENT ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER
+%token <keyword> ABORT_P ABSENT ABSOLUTE_P ACCESS ACCORDING ACTION ADD_P ADMIN AFTER
AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC
ASENSITIVE ASSERTION ASSIGNMENT ASYMMETRIC ATOMIC AT ATTACH ATTRIBUTE AUTHORIZATION
@@ -795,7 +795,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
WAIT WHEN WHERE WHITESPACE_P WINDOW WITH WITHIN WITHOUT WORK WRAPPER WRITE
XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLNAMESPACES
- XMLPARSE XMLPI XMLROOT XMLSERIALIZE XMLTABLE
+ XMLPARSE XMLPI XMLROOT XMLSCHEMA XMLSERIALIZE XMLTABLE XMLVALIDATE
YEAR_P YES_P
@@ -16231,6 +16231,17 @@ func_expr_common_subexpr:
n->location = @1;
$$ = (Node *) n;
}
+ | XMLVALIDATE '(' document_or_content a_expr ACCORDING TO XMLSCHEMA a_expr ')'
+ {
+ XmlExpr *x = (XmlExpr *)
+ makeXmlExpr(IS_XMLVALIDATE, NULL, NIL,
+ list_make2($4, $8),
+ @1);
+
+ x->xmloption = $3;
+ x->location = @1;
+ $$ = (Node *) x;
+ }
| JSON_OBJECT '(' func_arg_list ')'
{
/* Support for legacy (non-standard) json_object() */
@@ -17841,6 +17852,7 @@ unreserved_keyword:
| ABSENT
| ABSOLUTE_P
| ACCESS
+ | ACCORDING
| ACTION
| ADD_P
| ADMIN
@@ -18168,6 +18180,7 @@ unreserved_keyword:
| WRAPPER
| WRITE
| XML_P
+ | XMLSCHEMA
| YEAR_P
| YES_P
| ZONE
@@ -18247,6 +18260,7 @@ col_name_keyword:
| XMLROOT
| XMLSERIALIZE
| XMLTABLE
+ | XMLVALIDATE
;
/* Type/function identifier --- keywords that can be type or function names.
@@ -18386,6 +18400,7 @@ bare_label_keyword:
| ABSENT
| ABSOLUTE_P
| ACCESS
+ | ACCORDING
| ACTION
| ADD_P
| ADMIN
@@ -18835,8 +18850,10 @@ bare_label_keyword:
| XMLPARSE
| XMLPI
| XMLROOT
+ | XMLSCHEMA
| XMLSERIALIZE
| XMLTABLE
+ | XMLVALIDATE
| YES_P
| ZONE
;
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 6b8fa15fca3..97c84acc17c 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -2471,6 +2471,14 @@ transformXmlExpr(ParseState *pstate, XmlExpr *x)
/* not handled here */
Assert(false);
break;
+ case IS_XMLVALIDATE:
+ if (i == 0)
+ newe = coerce_to_specific_type(pstate, newe, XMLOID,
+ "XMLVALIDATE");
+ else
+ newe = coerce_to_specific_type(pstate, newe, TEXTOID,
+ "XMLVALIDATE");
+ break;
case IS_DOCUMENT:
newe = coerce_to_specific_type(pstate, newe, XMLOID,
"IS DOCUMENT");
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index 905c975d83b..931df1e58b9 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1976,6 +1976,9 @@ FigureColnameInternal(Node *node, char **name)
case IS_XMLSERIALIZE:
*name = "xmlserialize";
return 2;
+ case IS_XMLVALIDATE:
+ *name = "xmlvalidate";
+ return 2;
case IS_DOCUMENT:
/* nothing */
break;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 9f85eb86da1..3c41ceae610 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -10119,10 +10119,13 @@ get_rule_expr(Node *node, deparse_context *context,
case IS_XMLSERIALIZE:
appendStringInfoString(buf, "XMLSERIALIZE(");
break;
+ case IS_XMLVALIDATE:
+ appendStringInfoString(buf, "XMLVALIDATE(");
+ break;
case IS_DOCUMENT:
break;
}
- if (xexpr->op == IS_XMLPARSE || xexpr->op == IS_XMLSERIALIZE)
+ if (xexpr->op == IS_XMLPARSE || xexpr->op == IS_XMLSERIALIZE || xexpr->op == IS_XMLVALIDATE)
{
if (xexpr->xmloption == XMLOPTION_DOCUMENT)
appendStringInfoString(buf, "DOCUMENT ");
@@ -10226,6 +10229,17 @@ get_rule_expr(Node *node, deparse_context *context,
}
}
break;
+ case IS_XMLVALIDATE:
+ Assert(list_length(xexpr->args) == 2);
+
+ get_rule_expr((Node *) linitial(xexpr->args),
+ context, true);
+
+ appendStringInfoString(buf, " ACCORDING TO XMLSCHEMA ");
+
+ get_rule_expr((Node *) lsecond(xexpr->args),
+ context, true);
+ break;
case IS_DOCUMENT:
get_rule_expr_paren((Node *) xexpr->args, context, false, node);
break;
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index c8ab9d61c68..7293c31dc97 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -58,6 +58,7 @@
#include <libxml/xmlwriter.h>
#include <libxml/xpath.h>
#include <libxml/xpathInternals.h>
+#include <libxml/xmlschemas.h>
/*
* We used to check for xmlStructuredErrorContext via a configure test; but
@@ -1158,10 +1159,116 @@ xmlvalidate(PG_FUNCTION_ARGS)
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("xmlvalidate is not implemented")));
+ errmsg("xmlvalidate is not implemented against generalized schema definitions")));
return 0;
}
+/*
+ * xmlvalidate - validate XML against an XML provided Schema (XSD)
+ *
+ * Returns true if the XML is valid according to the xml schema,
+ * false if it doesn't and NULL if any (xml schema or xml) are NULL.
+ * This implements the SQL:2008 XMLVALIDATE function with the limitation of
+ * not looking at import references. This function relies on the protections
+ * enforced by pg_xml_init.
+ */
+bool
+xmlvalidate_text_schema(xmltype *data, text *schema)
+{
+#ifdef USE_LIBXML
+ xmlDocPtr doc = NULL;
+ xmlSchemaParserCtxtPtr schema_parser_ctxt = NULL;
+ xmlSchemaPtr schema_ptr = NULL;
+ xmlSchemaValidCtxtPtr valid_ctxt = NULL;
+ char *datastr;
+ char *schemastr;
+ int result;
+ PgXmlErrorContext *xmlerrcxt;
+
+ datastr = text_to_cstring((text *) data);
+ schemastr = text_to_cstring(schema);
+ xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_WELLFORMED);
+
+ PG_TRY();
+ {
+ doc = xmlReadMemory(datastr, strlen(datastr), NULL, NULL, 0);
+ if (doc == NULL)
+ {
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ "invalid XML document");
+ }
+
+ schema_parser_ctxt = xmlSchemaNewMemParserCtxt(schemastr, strlen(schemastr));
+ if (schema_parser_ctxt == NULL)
+ {
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ "failed to create schema parser context");
+ }
+
+ schema_ptr = xmlSchemaParse(schema_parser_ctxt);
+ if (schema_ptr == NULL)
+ {
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ "failed to parse XML schema");
+ }
+
+ valid_ctxt = xmlSchemaNewValidCtxt(schema_ptr);
+ if (valid_ctxt == NULL)
+ {
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+ "failed to create schema validation context");
+ }
+
+ /* Validate the document - returns 0 if valid,
+ greater than 0 if invalid and < 0 if error */
+ result = xmlSchemaValidateDoc(valid_ctxt, doc);
+ if (result < 0)
+ {
+ xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
+ "internal error during schema validation");
+ } else if (result == 0) {
+ return true;
+ } else {
+ return false;
+ }
+ }
+ PG_CATCH();
+ {
+ if (valid_ctxt)
+ xmlSchemaFreeValidCtxt(valid_ctxt);
+ if (schema_ptr)
+ xmlSchemaFree(schema_ptr);
+ if (schema_parser_ctxt)
+ xmlSchemaFreeParserCtxt(schema_parser_ctxt);
+ if (doc)
+ xmlFreeDoc(doc);
+ pg_xml_done(xmlerrcxt, true);
+ pfree(datastr);
+ pfree(schemastr);
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+ if (valid_ctxt)
+ xmlSchemaFreeValidCtxt(valid_ctxt);
+ if (schema_ptr)
+ xmlSchemaFree(schema_ptr);
+ if (schema_parser_ctxt)
+ xmlSchemaFreeParserCtxt(schema_parser_ctxt);
+ if (doc)
+ xmlFreeDoc(doc);
+
+ pg_xml_done(xmlerrcxt, false);
+
+ pfree(datastr);
+ pfree(schemastr);
+ // Default case since nothing got returned
+ // out of the normal path for validation calls to libxml
+ return false;
+#else
+ NO_XML_SUPPORT();
+ return NULL;
+#endif
+}
bool
xml_is_document(xmltype *arg)
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1b4436f2ff6..5365a0012c1 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1610,6 +1610,7 @@ typedef enum XmlExprOp
IS_XMLROOT, /* XMLROOT(xml, version, standalone) */
IS_XMLSERIALIZE, /* XMLSERIALIZE(is_document, xmlval, indent) */
IS_DOCUMENT, /* xmlval IS DOCUMENT */
+ IS_XMLVALIDATE, /* XMLVALIDATE(xmlval, schema_text) */
} XmlExprOp;
typedef enum XmlOptionType
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 5d4fe27ef96..be7e8bab3f6 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -29,6 +29,7 @@ PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("absent", ABSENT, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("absolute", ABSOLUTE_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("access", ACCESS, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("according", ACCORDING, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("action", ACTION, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("add", ADD_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("admin", ADMIN, UNRESERVED_KEYWORD, BARE_LABEL)
@@ -518,8 +519,10 @@ PG_KEYWORD("xmlnamespaces", XMLNAMESPACES, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("xmlparse", XMLPARSE, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("xmlpi", XMLPI, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("xmlroot", XMLROOT, COL_NAME_KEYWORD, BARE_LABEL)
+PG_KEYWORD("xmlschema", XMLSCHEMA, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("xmlserialize", XMLSERIALIZE, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("xmltable", XMLTABLE, COL_NAME_KEYWORD, BARE_LABEL)
+PG_KEYWORD("xmlvalidate", XMLVALIDATE, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("year", YEAR_P, UNRESERVED_KEYWORD, AS_LABEL)
PG_KEYWORD("yes", YES_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("zone", ZONE, UNRESERVED_KEYWORD, BARE_LABEL)
diff --git a/src/include/utils/xml.h b/src/include/utils/xml.h
index 732dac47bc4..6a64fd092a0 100644
--- a/src/include/utils/xml.h
+++ b/src/include/utils/xml.h
@@ -76,6 +76,7 @@ extern xmltype *xmlelement(XmlExpr *xexpr,
extern xmltype *xmlparse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace);
extern xmltype *xmlpi(const char *target, text *arg, bool arg_is_null, bool *result_is_null);
extern xmltype *xmlroot(xmltype *data, text *version, int standalone);
+extern bool xmlvalidate_text_schema(xmltype *data, text *schema);
extern bool xml_is_document(xmltype *arg);
extern text *xmltotext_with_options(xmltype *data, XmlOptionType xmloption_arg,
bool indent);
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 103a22a3b1d..7ea802412a5 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1881,3 +1881,283 @@ SELECT xmltext('x'|| '<P>73</P>'::xml || .42 || true || 'j'::char);
x<P>73</P>0.42truej
(1 row)
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name><age>30</age></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ t
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ f
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name><age>not-a-number</age></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ f
+(1 row)
+
+SELECT xmlvalidate(CONTENT '<book><title>PostgreSQL Internals</title></book>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="book">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="title" type="xs:string"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ t
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT NULL
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+ xmlvalidate
+-------------
+
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA NULL);
+ xmlvalidate
+-------------
+
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT NULL ACCORDING TO XMLSCHEMA NULL);
+ xmlvalidate
+-------------
+
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT '<product id="123"><name>Widget</name><price>9.99</price></product>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ <xs:attribute name="id" type="xs:string" use="required"/>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ t
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT '<product><name>Widget</name><price>9.99</price></product>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ <xs:attribute name="id" type="xs:string" use="required"/>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ f
+(1 row)
+
+SELECT xmlvalidate(DOCUMENT
+ '<company>
+ <employee>
+ <name>Alice</name>
+ <position>Developer</position>
+ <salary>75000</salary>
+ </employee>
+ </company>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="company">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="employee" maxOccurs="unbounded">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="position" type="xs:string"/>
+ <xs:element name="salary" type="xs:decimal"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+ xmlvalidate
+-------------
+ t
+(1 row)
+
+CREATE TABLE xml_validation_test (
+ xml_data xml,
+ xsd_schema text
+);
+INSERT INTO xml_validation_test VALUES
+ ('<number>42</number>',
+ '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="number" type="xs:integer"/>
+</xs:schema>'),
+ ('<number>not-a-number</number>',
+ '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="number" type="xs:integer"/>
+</xs:schema>'),
+ ('<text>Hello World</text>',
+ '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="text" type="xs:string"/>
+</xs:schema>');
+SELECT xmlvalidate(DOCUMENT xml_data ACCORDING TO XMLSCHEMA xsd_schema) AS is_valid
+FROM xml_validation_test;
+ is_valid
+----------
+ t
+ f
+ t
+(3 rows)
+
+DROP TABLE xml_validation_test;
+SELECT xmlvalidate(DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<this-is-not-valid-xsd>');
+ERROR: failed to parse XML schema
+DETAIL: line 1: Premature end of data in tag this-is-not-valid-xsd line 1
+<this-is-not-valid-xsd>
+ ^
+SELECT xmlvalidate(DOCUMENT '<unclosed-tag>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+ERROR: invalid XML content
+LINE 1: SELECT xmlvalidate(DOCUMENT '<unclosed-tag>'
+ ^
+DETAIL: line 1: Premature end of data in tag unclosed-tag line 1
+<unclosed-tag>
+ ^
+-- xs:import with external URL should be blocked
+SELECT xmlvalidate(
+ DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:import namespace="http://example.com/external" schemaLocation="http://example.com/malicious.xsd"/>
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+ERROR: failed to parse XML schema
+DETAIL: line 1: Document is empty
+
+^
+-- xs:include with external URL should be blocked
+SELECT xmlvalidate(
+ DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:include schemaLocation="http://example.com/external-schema.xsd"/>
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+ERROR: failed to parse XML schema
+DETAIL: line 1: Document is empty
+
+^
+-- file:// URL should be blocked
+SELECT xmlvalidate(
+ DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:include schemaLocation="file:///etc/passwd"/>
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+ERROR: failed to parse XML schema
+DETAIL: line 1: Document is empty
+
+^
+-- Test a_expr with subqueries
+CREATE TEMP TABLE xml_docs (id smallint, doc xml);
+CREATE TEMP TABLE xsd_schemas (id smallint, schema text);
+INSERT INTO xml_docs VALUES
+ (1, '<product><name>Widget</name><price>19.99</price></product>'),
+ (2, '<product><name>Gadget</name><price>not-a-number</price></product>');
+INSERT INTO xsd_schemas VALUES
+ (1, '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+-- subquery for document argument
+SELECT xmlvalidate(
+ DOCUMENT (SELECT doc FROM xml_docs WHERE id = 1)
+ ACCORDING TO XMLSCHEMA (SELECT schema FROM xsd_schemas WHERE id = 1)
+) AS subquery_test_valid;
+ subquery_test_valid
+---------------------
+ t
+(1 row)
+
+-- subquery for an invalid document
+SELECT xmlvalidate(
+ DOCUMENT (SELECT doc FROM xml_docs WHERE id = 2)
+ ACCORDING TO XMLSCHEMA (SELECT schema FROM xsd_schemas WHERE id = 1)
+) AS subquery_test_invalid;
+ subquery_test_invalid
+-----------------------
+ f
+(1 row)
+
+DROP TABLE xml_docs;
+DROP TABLE xsd_schemas;
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 0ea4f508837..c8ea2994eb2 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -679,3 +679,219 @@ SELECT xmltext(' ');
SELECT xmltext('foo `$_-+?=*^%!|/\()[]{}');
SELECT xmltext('foo & <"bar">');
SELECT xmltext('x'|| '<P>73</P>'::xml || .42 || true || 'j'::char);
+
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name><age>30</age></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT '<person><name>John</name><age>not-a-number</age></person>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="person">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="age" type="xs:integer"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+SELECT xmlvalidate(CONTENT '<book><title>PostgreSQL Internals</title></book>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="book">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="title" type="xs:string"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT NULL
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA NULL);
+
+SELECT xmlvalidate(DOCUMENT NULL ACCORDING TO XMLSCHEMA NULL);
+
+SELECT xmlvalidate(DOCUMENT '<product id="123"><name>Widget</name><price>9.99</price></product>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ <xs:attribute name="id" type="xs:string" use="required"/>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT '<product><name>Widget</name><price>9.99</price></product>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ <xs:attribute name="id" type="xs:string" use="required"/>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT
+ '<company>
+ <employee>
+ <name>Alice</name>
+ <position>Developer</position>
+ <salary>75000</salary>
+ </employee>
+ </company>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="company">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="employee" maxOccurs="unbounded">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="position" type="xs:string"/>
+ <xs:element name="salary" type="xs:decimal"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+CREATE TABLE xml_validation_test (
+ xml_data xml,
+ xsd_schema text
+);
+
+INSERT INTO xml_validation_test VALUES
+ ('<number>42</number>',
+ '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="number" type="xs:integer"/>
+</xs:schema>'),
+ ('<number>not-a-number</number>',
+ '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="number" type="xs:integer"/>
+</xs:schema>'),
+ ('<text>Hello World</text>',
+ '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="text" type="xs:string"/>
+</xs:schema>');
+
+SELECT xmlvalidate(DOCUMENT xml_data ACCORDING TO XMLSCHEMA xsd_schema) AS is_valid
+FROM xml_validation_test;
+
+DROP TABLE xml_validation_test;
+
+SELECT xmlvalidate(DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<this-is-not-valid-xsd>');
+
+SELECT xmlvalidate(DOCUMENT '<unclosed-tag>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+
+-- xs:import with external URL should be blocked
+SELECT xmlvalidate(
+ DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:import namespace="http://example.com/external" schemaLocation="http://example.com/malicious.xsd"/>
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+
+-- xs:include with external URL should be blocked
+SELECT xmlvalidate(
+ DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:include schemaLocation="http://example.com/external-schema.xsd"/>
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+
+-- file:// URL should be blocked
+SELECT xmlvalidate(
+ DOCUMENT '<test>value</test>'
+ ACCORDING TO XMLSCHEMA '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:include schemaLocation="file:///etc/passwd"/>
+ <xs:element name="test" type="xs:string"/>
+</xs:schema>');
+
+-- Test a_expr with subqueries
+CREATE TEMP TABLE xml_docs (id smallint, doc xml);
+CREATE TEMP TABLE xsd_schemas (id smallint, schema text);
+
+INSERT INTO xml_docs VALUES
+ (1, '<product><name>Widget</name><price>19.99</price></product>'),
+ (2, '<product><name>Gadget</name><price>not-a-number</price></product>');
+
+INSERT INTO xsd_schemas VALUES
+ (1, '<?xml version="1.0"?>
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="product">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element name="name" type="xs:string"/>
+ <xs:element name="price" type="xs:decimal"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>');
+
+-- subquery for document argument
+SELECT xmlvalidate(
+ DOCUMENT (SELECT doc FROM xml_docs WHERE id = 1)
+ ACCORDING TO XMLSCHEMA (SELECT schema FROM xsd_schemas WHERE id = 1)
+) AS subquery_test_valid;
+
+-- subquery for an invalid document
+SELECT xmlvalidate(
+ DOCUMENT (SELECT doc FROM xml_docs WHERE id = 2)
+ ACCORDING TO XMLSCHEMA (SELECT schema FROM xsd_schemas WHERE id = 1)
+) AS subquery_test_invalid;
+
+DROP TABLE xml_docs;
+DROP TABLE xsd_schemas;
--
2.51.2
view thread (16+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: WIP - xmlvalidate implementation from TODO list
In-Reply-To: <CAN3aFCfvVgXr77o=dB_E2kSCY+EgckSQbSBdd_N9n-LauWuQLw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox