Message-ID:
From: "vlsi (@vlsi)"
To: "pgjdbc/pgjdbc"
Date: Thu, 21 May 2026 07:58:26 +0000
Subject: [pgjdbc/pgjdbc] issue #4083: Secure Handshake Timeout Semantics: Revisit SSL/GSS Response-Only Timeouts
List-Id:
X-GitHub-Author-Id: 213894
X-GitHub-Author-Login: vlsi
X-GitHub-Issue: 4083
X-GitHub-Repo: pgjdbc/pgjdbc
X-GitHub-State: open
X-GitHub-Type: issue
X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/issues/4083
Content-Type: text/plain; charset=utf-8

Revisit the timeout semantics of `sslResponseTimeout` and `gssResponseTimeout`.

Today those properties bound only the one-byte server response to `SSLRequest` / `GSSENCRequest`, not the full secure-upgrade phase. That is technically valid but operationally surprising.

## Current behavior

Current implementation is centered in:

- `org.postgresql.core.v3.ConnectionFactoryImpl`
- `org.postgresql.ssl.MakeSSL`
- `org.postgresql.gss.MakeGSS`

Observed behavior:

- `sslResponseTimeout` / `gssResponseTimeout` temporarily set `SO_TIMEOUT` for the one-byte upgrade response
- if a smaller active socket timeout is already present, the smaller value wins
- after the server replies positively to SSL, JSSE handshake proceeds under a socket timeout derived from `connectTimeout`
- these properties do not bound the rest of TLS/GSS handshake

## Why this is worth revisiting

From an operator/SRE perspective, people typically think in terms of:

- "how long can secure connection setup take?"

not:

- "how long can the single-byte pre-handshake response take?"

This mismatch creates avoidable confusion and makes timeout configuration less intuitive during failover and degraded network scenarios.

## Candidate directions

### 1. Keep existing properties but document them as low-level probes

Lowest-risk path:

- explicitly treat `sslResponseTimeout` / `gssResponseTimeout` as narrow protocol-step timeouts
- add a separate whole-phase property if broader semantics are desired

### 2. Add whole-phase handshake timeout properties

Possible additive properties:

- `sslHandshakeTimeout`
- `gssHandshakeTimeout`

These would cap the full secure-upgrade phase rather than only the first response byte.

### 3. Fold secure upgrade into a unified startup deadline model

If the driver adopts a stronger startup deadline budget, SSL/GSS could simply consume from that shared budget, reducing the need for phase-specific operator tuning.

### 4. Review current use of `connectTimeout` during JSSE handshake

If TLS handshake currently inherits timeout from `connectTimeout`, decide whether that is the right long-term semantic or just an implementation artifact.

## Questions to resolve

- Should response-only semantics stay as-is for compatibility?
- Should broader handshake timeout semantics be additive only?
- How should future whole-phase timeout properties interact with `socketTimeout` and `loginTimeout`?
- Is there any safe path to unify TLS and GSS upgrade timeout behavior more fully?

## Acceptance criteria

- behavior is clearly defined at the phase level
- operators can answer "how long can secure startup take?" without reading protocol code
- timeout semantics are covered by focused tests for:
  - non-responsive upgrade response
  - stalled TLS handshake after positive SSL response
  - stalled GSS path
  - interaction with `loginTimeout` and `socketTimeout`
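
## Illustrative sketch: shared startup deadline

To make candidate direction 3 more concrete, here is a minimal sketch of a shared deadline budget that also keeps the existing "smaller active socket timeout wins" rule. `StartupDeadline` and its methods are hypothetical names, not existing pgjdbc API, and the convention that a non-positive budget means "no limit" is an assumption made for illustration. The caller passes the clock value explicitly so the behavior is easy to test.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of pgjdbc): one deadline shared by every
 * startup phase, e.g. upgrade response, TLS/GSS handshake, authentication.
 */
final class StartupDeadline {
    private final boolean unbounded;
    private final long deadlineNanos;

    private StartupDeadline(boolean unbounded, long deadlineNanos) {
        this.unbounded = unbounded;
        this.deadlineNanos = deadlineNanos;
    }

    /** Assumption: budgetMillis <= 0 means "no overall limit". */
    static StartupDeadline startingAt(long nowNanos, long budgetMillis) {
        if (budgetMillis <= 0) {
            return new StartupDeadline(true, 0L);
        }
        return new StartupDeadline(false, nowNanos + TimeUnit.MILLISECONDS.toNanos(budgetMillis));
    }

    /** Whole milliseconds left in the budget, never negative. */
    long remainingMillis(long nowNanos) {
        if (unbounded) {
            return Long.MAX_VALUE;
        }
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(deadlineNanos - nowNanos));
    }

    /**
     * SO_TIMEOUT to apply before the next blocking read: the smaller of the
     * active socket timeout and the remaining budget. An active timeout of 0
     * follows the SO_TIMEOUT convention and means "infinite".
     */
    int socketTimeoutMillis(long nowNanos, int activeSoTimeoutMillis) throws SocketTimeoutException {
        if (unbounded) {
            return activeSoTimeoutMillis;
        }
        long remaining = remainingMillis(nowNanos);
        if (remaining <= 0) {
            // Never return 0 here: SO_TIMEOUT 0 would mean "wait forever".
            throw new SocketTimeoutException("startup deadline exceeded");
        }
        long capped = activeSoTimeoutMillis > 0
                ? Math.min(activeSoTimeoutMillis, remaining)
                : remaining;
        return (int) Math.min(capped, Integer.MAX_VALUE);
    }
}
```

Each phase would call something like `socket.setSoTimeout(deadline.socketTimeoutMillis(System.nanoTime(), socket.getSoTimeout()))` before it starts reading. Note that `SO_TIMEOUT` bounds each individual read, so a handshake made of many slow reads could still exceed the budget unless the value is refreshed between reads or the phase is bounded some other way. That gap is one reason the "stalled TLS handshake" test in the acceptance criteria matters.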