Message-ID: From: "davecramer (@davecramer)" To: "postgresql-interfaces/psqlodbc" Date: Tue, 28 Apr 2026 10:33:54 +0000 Subject: Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values In-Reply-To: References: List-Id: X-GitHub-Author-Login: davecramer X-GitHub-Comment-Id: 4334426790 X-GitHub-Comment-Type: issue_comment X-GitHub-Issue: 175 X-GitHub-Repo: postgresql-interfaces/psqlodbc X-GitHub-Type: comment X-GitHub-Url: https://github.com/postgresql-interfaces/psqlodbc/pull/175#issuecomment-4334426790 Content-Type: text/plain; charset=utf-8 Summary This PR fixes undefined behavior in decode() in dlg_specific.c. When a connection-string value contains a malformed percent escape (e.g., %, %A, %G1), conv_from_hex() reads past the valid data and performs a left-shift of a negative value — UB caught by UBSan. The fix validates that two hex digits follow % before calling conv_from_hex(), and passes malformed sequences through as literals. The Bug In the existing decode(): else if (inc == '%') { snprintf(&outs[o], ilen + 1 - o, "%c", conv_from_hex(&in[i])); o++; i += 2; } This unconditionally calls conv_from_hex(&in[i]), which reads s[1] and s[2] relative to the %. If the string is % at the end, s[1] is '\0' and s[2] is past the allocation. conv_from_hex computes val = '\0' - '0' = -48, then does y += val << 4 — left-shifting a negative value is undefined behavior per C11 §6.5.7. The Fix Three changes: 1. New validation function is_valid_percent_encoding(): static BOOL is_valid_percent_encoding(const char *s, size_t len) { if (len < 3) return FALSE; return isxdigit((unsigned char) s[1]) && isxdigit((unsigned char) s[2]); } 2. Guarded decode path — only calls conv_from_hex when valid; otherwise passes % through as a literal character. 3. Precondition comment on conv_from_hex(): /* Callers must ensure that s[1] and s[2] are hex digits. */ Assessment Correctness: ✅ Good The len < 3 check in is_valid_percent_encoding correctly handles both truncated (%, %A) and end-of-string cases by passing ilen - i as the remaining length. The isxdigit calls with (unsigned char) casts are correct — this avoids UB from negative char values on platforms where char is signed. The fallthrough to outs[o++] = inc (copying the literal %) is the right behavior for malformed escapes. Semantic change: ✅ Acknowledged and acceptable Previously %G1 would produce a garbage byte (from the UB computation). Now it produces the literal string %G1. This is strictly better — no real-world user depends on garbage output from UB. The PR description explicitly calls this out. Test: ✅ Well-designed The test covers four cases: - % — truncated at end of string - %A — only one hex digit - %G1 — non-hex character - %20 — valid escape (space) Each case goes through the full SQLDriverConnect() path and asserts success, not just crash-freedom. The test uses pqopt=application_name=... which exercises decode_or_remove_braces() → decode() without interfering with authentication. This is a smart test design choice. conv_from_hex still fragile: ⚠️ Noted but acceptable The precondition comment is the right minimal approach for this PR. A defensive isxdigit check inside conv_from_hex itself would be belt-and-suspenders, but it would also change the return type semantics (what does it return on invalid input?). The comment is sufficient — conv_from_hex is static and only called from decode(), so the blast radius of a future misuse is limited to this file. Issues found One minor concern with i += 2 on the valid path: In the valid branch: if (is_valid_percent_encoding(&in[i], ilen - i)) { snprintf(&outs[o], ilen + 1 - o, "%c", conv_from_hex(&in[i])); o++; i += 2; } The i += 2 skips past the two hex digits, and then the for loop does i++, so i advances by 3 total (past %XX). This is the same logic as the original code and is correct. No issue here. The snprintf for a single byte is wasteful (as noted by davecramer in the PR comments), but it's pre-existing code and out of scope for this fix. Agreed — not a blocker. Verdict Approve. This is a clean, well-scoped fix for real undefined behavior with a thoughtful regression test. The contributor responded well to review feedback (added the precondition comment, rebased onto latest main). The test/tests file now correctly appends after conn-settings-test from PR #176's branch. Ready to merge once CI passes.