pgjdbc/pgjdbc GitHub issues and pull requests (mirror)
[pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
6+ messages / 4 participants
* [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
@ 2024-04-18 14:30 "demonti (@demonti)" <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: demonti (@demonti) @ 2024-04-18 14:30 UTC (permalink / raw)
To: pgjdbc/pgjdbc <[email protected]>
[pg-xml-test.tar.gz](https://github.com/pgjdbc/pgjdbc/files/15026375/pg-xml-test.tar.gz)
**Describe the issue**
Our software is still running on Java 8, but we are in the process of migrating it to Java 21. When doing performance tests, my colleague accidentally ran the tests using Java 21, and it showed a considerable performance degradation – it took two or three times longer.
I profiled the application with the Linux "gprofng" tool. The profile showed a large amount of time spent in specific JDBC calls related to XML support (SQLXML, PreparedStatement.setSQLXML, ResultSet.getSQLXML), and that time is actually consumed by Java class loading code. I then used the Linux "strace" tool to validate this assumption, as described below.
In our project, we replaced the built-in XML parsers and transformers with the original Apache Xerces and the Saxon XSLT processor. The PostgreSQL JDBC driver uses the built-in XML API to perform various conversions from user-generated XML to the driver's internal representation and vice versa. To get actual implementations of the interfaces, the driver uses factories provided by Java itself. This seems to be done by the DefaultPGXmlFactoryFactory class.
It turns out that each time a *factory* is constructed (not the actual XML parser) via the `<class>.newInstance()` methods, Java 21 reloads the respective JAR(s) to some extent. The profiling shows that the respective JAR files are loaded, uncompressed, and their signatures verified again and again. This wastes resources and is the source of the low performance we observed.
Performing the same test on Java 8 did not show a similar behaviour.
The reasoning behind creating a new factory for each new parser or similar object was not erroneous: the Javadocs of old Java versions (e.g. 5) clearly note that the APIs should not be considered thread-safe. By Java 8 at the latest, however, these notes had been removed, indicating that from then on the factories may be considered thread-safe. So implementations like DefaultPGXmlFactoryFactory could keep a copy of the respective factories and avoid creating them over and over again.
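The cost difference described above can be checked in isolation with a small timing harness. This is only a rough sketch using the JDK's `DocumentBuilderFactory`; the class name `FactoryLookupTiming` and the iteration counts are made up for illustration, and the gap will depend on which parser implementation is on the classpath (with only the JDK's built-in parser it may be small).

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper, not part of pgjdbc: compares "new factory per builder"
// with "one shared factory" for the same number of DocumentBuilder creations.
class FactoryLookupTiming {

    /** Looks up a fresh factory for every builder; returns elapsed nanoseconds. */
    static long freshFactoryEachTime(int iterations) throws ParserConfigurationException {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // newInstance() runs the JAXP provider lookup on every call
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        }
        return System.nanoTime() - start;
    }

    /** Looks up the factory once and reuses it; returns elapsed nanoseconds. */
    static long sharedFactory(int iterations) throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            factory.newDocumentBuilder();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 10_000;

        // Short warm-up so that one-time class loading does not skew the first measurement
        freshFactoryEachTime(100);
        sharedFactory(100);

        System.out.printf("fresh factory per call: %d ms%n", freshFactoryEachTime(iterations) / 1_000_000);
        System.out.printf("shared factory:         %d ms%n", sharedFactory(iterations) / 1_000_000);
        // Shows which implementation the lookup actually resolved to (e.g. Xerces or the JDK default)
        System.out.println("implementation: " + DocumentBuilderFactory.newInstance().getClass().getName());
    }
}
```

Running it once under Java 8 and once under Java 21, with the third-party parser JARs on the classpath, should make the difference visible without involving the database at all.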
**Driver Version?**
42.7.3
**Java Version?**
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment (build 21+35-2513)
OpenJDK 64-Bit Server VM (build 21+35-2513, mixed mode, sharing)
**OS Version?**
Ubuntu 23.10 (x64)
**PostgreSQL Version?**
16.0
**To Reproduce**
Compile the provided example code. Let it run under Java 21 and measure the time.
Repeat this with a Java 8. Compare the times.
Alternatively run the code with strace. The strace will show a large number of read accesses of the Saxon JAR file. It relates to the number of iterations done in the test program.
**Expected behaviour**
The third-party XML libraries should be loaded only once. This is of course also an issue for the Java VM developers at Oracle. Perhaps I will file an issue there as well, but in my experience the resolution of such issues takes quite long.
**Logs**
Logs are omitted due to their size (up to a Gigabyte) and due to non-public content.
This is a summary of the Linux "stat" system calls on JAR files in our software during a load test, showing the most frequent candidates:
6 stat("/home/klaus/projects/tango/svn/tango.2/apps/srs/build/install/srs/lib/regex-21.1.0.jar"
7 stat("/home/klaus/projects/tango/svn/tango.2/apps/srs/build/install/srs/lib/js-21.1.0.jar"
19 stat("/home/klaus/projects/tango/svn/tango.2/apps/srs/build/install/srs/lib/woodstox-core-6.2.6.jar"
20 stat("/home/klaus/projects/tango/svn/tango.2/apps/srs/build/install/srs/lib/truffle-api-21.1.0.jar"
75015 stat("/home/klaus/projects/tango/svn/tango.2/apps/srs/build/install/srs/lib/Saxon-HE-9.8.0-15.jar"
163383 stat("/home/klaus/projects/tango/svn/tango.2/apps/srs/build/install/srs/lib/xercesImpl-2.12.1.jar"
In the provided test case, Java 8 performed 1653 read operations on the Saxon file, while Java 21 performed 40546 read operations.
* Re: [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
@ 2024-04-19 15:57 ` "sehrope (@sehrope)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: sehrope (@sehrope) @ 2024-04-19 15:57 UTC (permalink / raw)
To: pgjdbc/pgjdbc <[email protected]>
@demonti
That's very interesting. I'm not sure about merging in JDK-specific behavior in the core driver, but luckily in this case you can handle it in the connection options without waiting for a new release.
There's a connection property, `xmlFactoryFactory`, that lets you specify a factory class for instantiating the XML factories. The default is the built-in class, DefaultPGXmlFactoryFactory, which acts as you describe (creating a new instance for each request).
You can create your own class that implements `PGXmlFactoryFactory` with caching across requests. That should solve your JAR loading performance issues, as the factories could be initialized statically and reused.
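A minimal sketch of what such a caching class might look like follows. It uses only JDK types so that it compiles without the driver; in real use the class would be declared `public`, live in its own file, and add `implements org.postgresql.xml.PGXmlFactoryFactory`. The method names are taken from the interface as it appears later in this thread; the class name `CachingXmlFactories` is made up. Sharing factory instances across threads relies on the thread-safety assumption discussed in the issue, so factory calls are synchronized here to be conservative. The security settings shown are only a partial, assumed stand-in for whatever hardening the driver's default factory applies.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical class: the factories are looked up once, when the class is initialized.
// For use with the driver: make it public and implement org.postgresql.xml.PGXmlFactoryFactory.
class CachingXmlFactories {
    private static final DocumentBuilderFactory DOCUMENT_BUILDER_FACTORY = createDocumentBuilderFactory();
    private static final SAXParserFactory SAX_PARSER_FACTORY = createSaxParserFactory();
    private static final TransformerFactory TRANSFORMER_FACTORY = TransformerFactory.newInstance();
    // Assumes the configured TransformerFactory is SAX-capable (true for the JDK default)
    private static final SAXTransformerFactory SAX_TRANSFORMER_FACTORY =
            (SAXTransformerFactory) TransformerFactory.newInstance();
    private static final XMLInputFactory XML_INPUT_FACTORY = XMLInputFactory.newInstance();
    private static final XMLOutputFactory XML_OUTPUT_FACTORY = XMLOutputFactory.newInstance();

    private static DocumentBuilderFactory createDocumentBuilderFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            // Assumed minimal hardening; a real replacement should mirror the driver's defaults
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("secure processing not supported", e);
        }
        return factory;
    }

    private static SAXParserFactory createSaxParserFactory() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // SAX2 readers are namespace-aware by default
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("secure processing not supported", e);
        }
        return factory;
    }

    /** Builders are created per call; only the factory behind them is cached. */
    public DocumentBuilder newDocumentBuilder() throws ParserConfigurationException {
        synchronized (DOCUMENT_BUILDER_FACTORY) {
            return DOCUMENT_BUILDER_FACTORY.newDocumentBuilder();
        }
    }

    public TransformerFactory newTransformerFactory() {
        return TRANSFORMER_FACTORY;
    }

    public SAXTransformerFactory newSAXTransformerFactory() {
        return SAX_TRANSFORMER_FACTORY;
    }

    public XMLInputFactory newXMLInputFactory() {
        return XML_INPUT_FACTORY;
    }

    public XMLOutputFactory newXMLOutputFactory() {
        return XML_OUTPUT_FACTORY;
    }

    /** Readers are created per call from the cached SAX parser factory. */
    public XMLReader createXMLReader() throws SAXException {
        try {
            synchronized (SAX_PARSER_FACTORY) {
                return SAX_PARSER_FACTORY.newSAXParser().getXMLReader();
            }
        } catch (ParserConfigurationException e) {
            throw new SAXException("could not create SAX parser", e);
        }
    }
}
```

The sketch caches the factories but still hands out a new `DocumentBuilder` and `XMLReader` per call, since parser instances are generally not safe to share between threads. The class would then be selected through the connection property, presumably by its fully qualified name, for example `props.setProperty("xmlFactoryFactory", "com.example.CachingXmlFactories")`, where the package name is hypothetical.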
---
@davecramer I think we (or more accurately me...) forgot to add `xmlFactoryFactory` to the connection property descriptions on the website. It got added when we fixed that CVE to allow opt-in fallback to the old insecure behavior: https://github.com/pgjdbc/pgjdbc/commit/14b62aca4764d496813f55a43d050b017e01eb65
But I guess we never added it to the site itself so one would have to look at the enum or the release notes to know about it. A page that's generated off the driver's actual enum values would be nice as it already has the types and descriptions. That way it's always in sync.
* Re: [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
@ 2024-04-19 16:33 ` "davecramer (@davecramer)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: davecramer (@davecramer) @ 2024-04-19 16:33 UTC (permalink / raw)
To: pgjdbc/pgjdbc <[email protected]>
> But I guess we never added it to the site itself so one would have to look at the enum or the release notes to know about it. A page that's generated off the driver's actual enum values would be nice as it already has the types and descriptions. That way it's always in sync.
Yes, it would ! :)
* Re: [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
@ 2024-04-23 11:05 ` "demonti (@demonti)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: demonti (@demonti) @ 2024-04-23 11:05 UTC (permalink / raw)
To: pgjdbc/pgjdbc <[email protected]>
For your information: I just created an issue in the Oracle Java Bug Database. As soon as this becomes accepted and public (somehow – don't know their process), I will post a link.
While the workaround is surely a solution, it might still be a good idea to solve the problem directly by reworking the code.
* Re: [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
@ 2024-04-24 07:08 ` "demonti (@demonti)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: demonti (@demonti) @ 2024-04-24 07:08 UTC (permalink / raw)
To: pgjdbc/pgjdbc <[email protected]>
The bug report has been accepted by Oracle. Of course, they do not give any timeframe or priority for when they will address it. The bug ID at Oracle is 8331025; direct link: [https://bugs.java.com/bugdatabase/view_bug?bug_id=8331025](https://bugs.java.com/bugdatabase/view_bug?bug_id=8331025).
* Re: [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers
@ 2025-07-29 13:23 ` "eitch (@eitch)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: eitch (@eitch) @ 2025-07-29 13:23 UTC (permalink / raw)
To: pgjdbc/pgjdbc <[email protected]>
I'm running into a similar issue: a client seems to have infrastructure problems where these frequent reloads lead to slow transactions. I've now tried to add a custom factory, but I am running into a bug where the driver says my implementation does not implement the interface. This is my class:
```java
import org.postgresql.xml.EmptyStringEntityResolver;
import org.postgresql.xml.NullErrorHandler;
import org.postgresql.xml.PGXmlFactoryFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;

public class CachingPGXmlFactoryFactory implements PGXmlFactoryFactory {
    @Override
    public DocumentBuilder newDocumentBuilder() throws ParserConfigurationException {
        return this.documentBuilder;
    }

    @Override
    public TransformerFactory newTransformerFactory() {
        return this.transformerFactory;
    }

    @Override
    public SAXTransformerFactory newSAXTransformerFactory() {
        return this.saxTransformerFactory;
    }

    @Override
    public XMLInputFactory newXMLInputFactory() {
        return this.xmlInputFactory;
    }

    @Override
    public XMLOutputFactory newXMLOutputFactory() {
        return this.xmlOutputFactory;
    }

    @Override
    public XMLReader createXMLReader() throws SAXException {
        return this.xmlReader;
    }
}
```
and I get the following exception when using the class:
```
Connection property xmlFactoryFactory must implement PGXmlFactoryFactory: li.strolch.persistence.postgresql.CachingPGXmlFactoryFactory
```
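The thread does not establish the cause of this exception. Purely as an assumption, two things that could be checked from application code are whether the JVM itself considers the class an implementation of the interface, and whether both were loaded from the same place (a second copy of the driver JAR on another class loader would make same-named types incompatible). The following sketch uses a hypothetical helper, `ImplementsCheck`, and demonstrates it with JDK types; in practice one would pass `PGXmlFactoryFactory.class` and the custom class.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of pgjdbc.
class ImplementsCheck {

    /**
     * True when impl is assignable to iface as the JVM sees it.
     * Note the direction: the interface is the receiver, the implementation is the argument.
     */
    static boolean isImplementation(Class<?> iface, Class<?> impl) {
        return iface.isAssignableFrom(impl);
    }

    /** Describes which class loader and code location a type came from. */
    static String origin(Class<?> type) {
        ClassLoader loader = type.getClassLoader(); // null means the bootstrap loader
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return type.getName()
                + " loader=" + loader
                + " source=" + (source == null ? "(none)" : String.valueOf(source.getLocation()));
    }

    public static void main(String[] args) {
        // Stand-ins for the real types: Thread implements Runnable
        Class<?> iface = Runnable.class;
        Class<?> impl = Thread.class;

        System.out.println("implements: " + isImplementation(iface, impl));
        System.out.println("reversed:   " + isImplementation(impl, iface));
        System.out.println(origin(iface));
        System.out.println(origin(impl));
    }
}
```

If the first check is true for the real types and both origins point to the same driver JAR, the custom class itself is probably fine and the problem lies elsewhere.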
end of thread, other threads:[~2025-07-29 13:23 UTC | newest]
Thread overview: 6+ messages
2024-04-18 14:30 [pgjdbc/pgjdbc] issue #3221: Performance degradation of XML operations under Java 21 with custom parsers/transformers "demonti (@demonti)" <[email protected]>
2024-04-19 15:57 ` "sehrope (@sehrope)" <[email protected]>
2024-04-19 16:33 ` "davecramer (@davecramer)" <[email protected]>
2024-04-23 11:05 ` "demonti (@demonti)" <[email protected]>
2024-04-24 07:08 ` "demonti (@demonti)" <[email protected]>
2025-07-29 13:23 ` "eitch (@eitch)" <[email protected]>