Message-ID:
From: "vlsi (@vlsi)"
To: "pgjdbc/pgjdbc"
Date: Sat, 18 Oct 2025 11:35:13 +0000
Subject: Re: [pgjdbc/pgjdbc] issue #3840: Periodic latency spikes to Postgres in large scale deployment
In-Reply-To:
References:
List-Id:
X-GitHub-Author-Login: vlsi
X-GitHub-Comment-Id: 3418299382
X-GitHub-Comment-Type: issue_comment
X-GitHub-Issue: 3840
X-GitHub-Repo: pgjdbc/pgjdbc
X-GitHub-Type: comment
X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/issues/3840#issuecomment-3418299382
Content-Type: text/plain; charset=utf-8

The driver uses a shared `SharedTimer` to implement statement cancellation via query timeout.
OpenJDK uses `Object.wait()` in `java.util.TimerThread`.

In other words, `SharedTimer` is expected to wait in `Object.wait` whenever the application uses something like `java.sql.Statement#setQueryTimeout` and executes a query.

I do not see how the `TimerThread` code could result in spikes "at 1-2 seconds past the minute mark".
Frankly, so far it looks like a scheduled activity in the application code that executes every minute.

The current `SharedTimer` does bother me a bit. For instance, it is suboptimal for executing a lot of queries with long timeouts, and we currently use `purgeTimerTasks()` to remove expired tasks. However, I haven't observed workloads that would run into the `purgeTimerTasks()` issue.

It would be interesting to get some stack traces, thread dumps, JFR recordings, or async-profiler results regarding the issue.

---

Technically speaking, `SharedTimer` indeed uses a single thread to fire its tasks. However, unexpected slowness of a single task should not impact the latency of the rest:
* The tasks execute without holding the timer lock
* Individual app threads do not wait for each task's execution

---

I would suggest capturing the stack traces for "at 1-2 seconds past the minute mark".
For instance, async-profiler's heatmaps might help you: https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
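
To illustrate why a timer thread parked in `Object.wait` is normal rather than a symptom, here is a minimal sketch that uses a plain `java.util.Timer` (not the driver's `SharedTimer`). It schedules a task far in the future and then inspects the timer thread's stack. The class name `TimerWaitDemo`, the thread name `demo-timer`, and the 60-second delay are assumptions made for the illustration only.

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerWaitDemo {

    /** Finds a live thread by name, or returns null if there is none. */
    static Thread findThread(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (name.equals(t.getName())) {
                return t;
            }
        }
        return null;
    }

    /** Returns true if any frame on the thread's stack is java.lang.Object.wait. */
    static boolean isInObjectWait(Thread t) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if ("java.lang.Object".equals(frame.getClassName())
                    && "wait".equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    /**
     * Starts a daemon Timer with one task scheduled 60 seconds ahead (roughly
     * what a pending query timeout looks like) and polls for up to ~5 seconds
     * to see whether the timer thread is sitting in Object.wait.
     */
    static boolean timerParksInObjectWait(String threadName) {
        Timer timer = new Timer(threadName, true);
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    // Never reached in this demo: the timer is cancelled first.
                }
            }, 60_000L);
            for (int i = 0; i < 100; i++) {
                Thread t = findThread(threadName);
                if (t != null && isInObjectWait(t)) {
                    return true;
                }
                Thread.sleep(50);
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel();
        }
    }

    public static void main(String[] args) {
        System.out.println("timer thread in Object.wait: "
                + timerParksInObjectWait("demo-timer"));
    }
}
```

A thread dump that shows the timer thread in this state only says that a timeout task is pending. It does not by itself indicate that the timer is the source of the latency.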