Re: WIP - xmlvalidate implementation from TODO list

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Jim Jones <[email protected]>
To: Marcos Magueta <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: WIP - xmlvalidate implementation from TODO list
Date: Fri, 23 Jan 2026 13:19:25 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAN3aFCcUFLbdVBoL6c2bMh4r5P9EnXM9eBsX8+ZyER7YBSDUtA@mail.gmail.com>
References: <CAN3aFCdx8AapWSVpJ1kaC7OC_v7QwbjgbGw9WfPBBY2GMyOadQ@mail.gmail.com>
	<CAN3aFCfvVgXr77o=dB_E2kSCY+EgckSQbSBdd_N9n-LauWuQLw@mail.gmail.com>
	<CAN3aFCcx_w5Ldb+SYurwd31es9hOJqLuKARQHHDOk7+5iOqBWQ@mail.gmail.com>
	<CALdSSPhFzYCp=Aa8AAboz6TQaTmjWciQGfrEJQeOOO+0pD1GGw@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAN3aFCfdGp6TGTQNOVO1im1u2vO_E2jnTGVV2xhea7eNY7GtuQ@mail.gmail.com>
	<[email protected]>
	<CAN3aFCcXwS7BrU1gHRUEBH3G59EVf_7LUhLeEWqW2Sc9Vk5k-A@mail.gmail.com>
	<CAN3aFCe_cBshj0rb7J8yoT+fRHOBOZmk-m8V7DMLDe0ZjSgjcA@mail.gmail.com>
	<[email protected]>
	<CAN3aFCc2voQ=6+Nwy99NFJZwveYmwtCKAj6U9RhjxqQc25+Q_g@mail.gmail.com>
	<[email protected]>
	<CAN3aFCcUFLbdVBoL6c2bMh4r5P9EnXM9eBsX8+ZyER7YBSDUtA@mail.gmail.com>



On 21/01/2026 21:44, Marcos Magueta wrote:
>> Any particular reason for that? If not, take a look at other options,
> e.g. a_expr
> No particular reason apart from it being simpler since I didn't need to
> invoke an execution at the cmd. Changed it now.
> 
>> Why did you choose text over xml for schemadata?
> My original thought was that XML schemas require additional validation
> in contrast to normal XML, but it being additive, we would have
> redundant checks. But in reconsideration, perhaps keeping the field with
> an XML type is more intuitive for anyone introspecting over the catalog.
> Also applied the change on the latest version of the patch.

Data type for schemadata in pg_xmlschema is now xml.

postgres=# \d pg_xmlschema
               Table "pg_catalog.pg_xmlschema"
     Column      |   Type    | Collation | Nullable | Default
-----------------+-----------+-----------+----------+---------
 oid             | oid       |           | not null |
 schemaname      | name      |           | not null |
 schemanamespace | oid       |           | not null |
 schemaowner     | oid       |           | not null |
 schemadata      | xml       |           | not null |
 schemaacl       | aclitem[] |           |          |
Indexes:
    "pg_xmlschema_oid_index" PRIMARY KEY, btree (oid)
    "pg_xmlschema_name_nsp_index" UNIQUE CONSTRAINT, btree (schemaname,
schemanamespace)

I agree it's more intuitive this way. It also facilitates function calls
that require the parameter to be xml, e.g. xmlserialize

postgres=# CREATE XMLSCHEMA x AS
 '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"><xs:element
name="duplicate" type="xs:string"/></xs:schema>';
CREATE XMLSCHEMA

postgres=# SELECT xmlserialize(DOCUMENT schemadata AS text INDENT) FROM
pg_xmlschema;
                      xmlserialize
---------------------------------------------------------
 <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">+
   <xs:element name="duplicate" type="xs:string"/>      +
 </xs:schema>
(1 row)


> I noticed DefineXmlSchema() calls IsThereXmlSchemaInNamespace() right
> after XmlSchemaCreate() returns a valid OID. Since XmlSchemaCreate()
> already inserted the tuple into the catalog (via CatalogTupleInsert at
> pg_xmlschema.c:166), wouldn't SearchSysCacheExists2() find it and always
> throw "already exists"? We all tested the original code and it worked
> fine, so I'm missing something about syscache visibility or timing; that
> was an early function I did to check for duplicates that ended up in the
> wrong place. I removed the call (and function) as I judged it to be
> redundant (the duplicate check already happens inside
> XmlSchemaCreate()), but is there something subtle about intra-command
> visibility I'm not understanding? If anyone knows, please let me know.

I couldn't find any IsThereXmlSchemaInNamespace call in DefineXmlSchema
in the current version, so I cannot say much here. But I agree that the
a further check is not necessary, since XmlSchemaCreate is already doing it.

> Also, I added tab completion on psql and fixed pg_dump.

Nice. pg_dump now exports CREATE XMLSCHEMA statements.

Tab completion for CREATE, ALTER, and DROP XMLSCHEMA now also works.

A few other comments

== patch version ==

You forgot to include the version to the patch name.

For instance, instead of
0001-Add-CREATE-ALTER-DROP-XMLSCHEMA-DDL-commands.patch the file could
be named v3-0001-Add-CREATE-ALTER-DROP-XMLSCHEMA-DDL-commands.patch

== IS_XMLVALIDATE dependency ==

The patches 0001, 0002, and 0003 depend on IS_XMLVALIDATE, which is only
introduced in 0004, so they cannot be compiled and tested independently.

== permissions ==

In the tests I see you added a few GRANTs to set the visibility of
certain xmlschemas:

GRANT USAGE ON XMLSCHEMA permission_test_schema TO regress_xmlschema_user2

I could not find anything regarding this in the docs. If we are to
support it, shouldn't we add it to grant.sgml?

As I mentioned upthread, I believe that schema registration and usage
should be privilege-controlled, for example via dedicated roles

GRANT pg_read_xmlschemas TO u;
GRANT pg_write_xmlschemas TO u;

What do you think?

But being able to grant or revoke access to a certain xmlschema also has
its appeal :)

Best, Jim

view thread (9+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: WIP - xmlvalidate implementation from TODO list
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox