pgjdbc/pgjdbc GitHub issues and pull requests (mirror)
help / color / mirror / Atom feedFrom: ilovethebeatles (@ilovethebeatles) <[email protected]>
To: pgjdbc/pgjdbc <[email protected]>
Subject: Re: [pgjdbc/pgjdbc] PR #3657: perf: cache composite and simple subqueries
Date: Sun, 22 Jun 2025 19:59:28 +0000
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
> Could you please measure the performance before and after the change?
Implemented benchmarks to measure throughput in the following cases:
1. Composite queries of different size, consisting of repeated subqueries, `prepareThreshold=1`
For k = 1…1000 (step 1), we build and execute a single SQL statement that repeats the subquery
"select val from bench_select where id = ?;"
k times (so first a 1-subquery statement, then 2-subquery, … up to 1000-subquery), cycling through these composite queries on one connection. With `threshold=1`, each subquery plan is cached on its first execution and then reused in all later composite queries.
2. Composite queries of different size, consisting of repeated subqueries, `prepareThreshold=5` (default)
Same loop, but with `threshold=5` (the default value for threshold). Subquery plans are only cached once the full composite statement has reached five executions, but because the 256-entry `statementCache` evicts those composite queries before they hit five runs, the subqueries never actually get cached. As a result, throughput stays the same as before.
3. Single parametrized `SELECT`
Just repeat the same simple `SELECT` under both thresholds to make sure the common case isn’t slower after the change.
Here are the results for the throughput in simple selects per second before and after my optimization:
| Scenario | Threshold | Before, selects/s | After , selects/s |
|-----------------------------------|:--------------:|:----------------:|:----------------:|
| Composite batch-select loop | 5 (default) | 33 374 ± 383 | 33 622 ± 495 |
| Composite batch-select loop | 1 | 29 355 ± 744 | 118 576 ± 450 |
| Single parametrized `SELECT` | 5 (default) | 13 643 ± 574 | 14 101 ± 68 |
| Single parametrized `SELECT` | 1 | 14 143 ± 151 | 13 629 ± 496 |
- **Threshold=5**: no real change because subqueries never reach the threshold, so the behaviour doesn't change
- **Threshold=1**:
- ~3.5× speedup in comparison to `threshold=5`, showing the benefit of immediate subquery caching in large composite queries with frequently repeated subqueries
- ~4× speedup over the stock driver, highlighting the impact of subquery deduplication.
- **Single-query**: performance within measurement noise, confirming no regression for the common case.
Further development might include tracking `executeCount` per subquery and adding parameter types to cache keys for better perfomance and accuracy.
view thread (16+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: github://pgjdbc/pgjdbc
Cc: [email protected], [email protected]
Subject: Re: [pgjdbc/pgjdbc] PR #3657: perf: cache composite and simple subqueries
In-Reply-To: <<[email protected]>>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox