Message-ID: From: "ilovethebeatles (@ilovethebeatles)" To: "pgjdbc/pgjdbc" Date: Sun, 22 Jun 2025 19:59:28 +0000 Subject: Re: [pgjdbc/pgjdbc] PR #3657: perf: cache composite and simple subqueries In-Reply-To: References: List-Id: X-GitHub-Author-Login: ilovethebeatles X-GitHub-Comment-Id: 2994415826 X-GitHub-Comment-Type: issue_comment X-GitHub-Issue: 3657 X-GitHub-Repo: pgjdbc/pgjdbc X-GitHub-Type: comment X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/pull/3657#issuecomment-2994415826 Content-Type: text/plain; charset=utf-8 > Could you please measure the performance before and after the change? Implemented benchmarks to measure throughput in the following cases: 1. Composite queries of different size, consisting of repeated subqueries, `prepareThreshold=1` For k = 1…1000 (step 1), we build and execute a single SQL statement that repeats the subquery "select val from bench_select where id = ?;" k times (so first a 1-subquery statement, then 2-subquery, … up to 1000-subquery), cycling through these composite queries on one connection. With `threshold=1`, each subquery plan is cached on its first execution and then reused in all later composite queries. 2. Composite queries of different size, consisting of repeated subqueries, `prepareThreshold=5` (default) Same loop, but with `threshold=5` (the default value for threshold). Subquery plans are only cached once the full composite statement has reached five executions, but because the 256-entry `statementCache` evicts those composite queries before they hit five runs, the subqueries never actually get cached. As a result, throughput stays the same as before. 3. Single parametrized `SELECT` Just repeat the same simple `SELECT` under both thresholds to make sure the common case isn’t slower after the change. Here are the results for the throughput in simple selects per second before and after my optimization: | Scenario | Threshold | Before, selects/s | After , selects/s | |-----------------------------------|:--------------:|:----------------:|:----------------:| | Composite batch-select loop | 5 (default) | 33 374 ± 383 | 33 622 ± 495 | | Composite batch-select loop | 1 | 29 355 ± 744 | 118 576 ± 450 | | Single parametrized `SELECT` | 5 (default) | 13 643 ± 574 | 14 101 ± 68 | | Single parametrized `SELECT` | 1 | 14 143 ± 151 | 13 629 ± 496 | - **Threshold=5**: no real change because subqueries never reach the threshold, so the behaviour doesn't change - **Threshold=1**: - ~3.5× speedup in comparison to `threshold=5`, showing the benefit of immediate subquery caching in large composite queries with frequently repeated subqueries - ~4× speedup over the stock driver, highlighting the impact of subquery deduplication. - **Single-query**: performance within measurement noise, confirming no regression for the common case. Further development might include tracking `executeCount` per subquery and adding parameter types to cache keys for better perfomance and accuracy.