postgresql-interfaces/psqlodbc GitHub issues and pull requests (mirror)
help / color / mirror / Atom feed[postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
6+ messages / 2 participants
[nested] [flat]
* [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 09:07 "jarvis24young (@jarvis24young)" <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: jarvis24young (@jarvis24young) @ 2026-04-23 09:07 UTC (permalink / raw)
To: postgresql-interfaces/psqlodbc <[email protected]>
This fixes sanitizer-detected undefined behavior in connection-string percent decoding.
Root cause:
- `decode()` treated every `%` in a connection-string value as the beginning of a `%xx` percent escape.
- It called `conv_from_hex(&in[i])` without first checking that two following hex digits were present.
- For a truncated value such as `pqopt=application_name=%`, `conv_from_hex()` reads the string terminator, computes a negative value, and left-shifts it. UBSan reports this as undefined behavior.
Fix:
- Decode only valid `%xx` escapes where both following characters are hex digits.
- Leave malformed or truncated percent sequences unchanged, preserving compatibility while avoiding undefined behavior.
Regression test:
- Adds `percent-decode-test`, which exercises the public `SQLDriverConnect()` path with `pqopt=application_name=%`, `%A`, `%G1`, and `%20`.
- The test requires `SQLDriverConnect()` to succeed for each case, so it verifies more than just "no crash".
Verification performed under ASan/UBSan:
```bash
cd ~/psqlodbc-build/test
export LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/13/libasan.so
ODBCSYSINI=. ODBCINSTINI=./odbcinst.ini ODBCINI=./odbc.ini \
ASAN_OPTIONS=halt_on_error=1:abort_on_error=1 \
UBSAN_OPTIONS=halt_on_error=1:print_stacktrace=1 \
./exe/percent-decode-test
```
Output:
```text
truncated percent: ok
one hex digit: ok
non-hex escape: ok
valid percent escape: ok
```
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 09:18 ` "jarvis24young (@jarvis24young)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: jarvis24young (@jarvis24young) @ 2026-04-23 09:18 UTC (permalink / raw)
To: postgresql-interfaces/psqlodbc <[email protected]>
I updated the regression test to make it stronger.
Instead of using `PWD=...` and accepting an authentication failure, the test now uses `pqopt=application_name=...`. That still exercises the same `decode_or_remove_braces()` / `decode()` path for connection-string percent decoding, but it does not override the DSN password, so `SQLDriverConnect()` is expected to succeed for each case.
For reference, before this patch the truncated percent case triggers UBSan in my local ASan/UBSan build:
```text
dlg_specific.c:1584:12: runtime error: left shift of negative value -48
#0 conv_from_hex /home/yjw/psqlodbc-build/dlg_specific.c:1584
#1 decode /home/yjw/psqlodbc-build/dlg_specific.c:1612
#2 decode_or_remove_braces /home/yjw/psqlodbc-build/dlg_specific.c:1658
#3 copyConnAttributes /home/yjw/psqlodbc-build/dlg_specific.c:639
#4 dconn_get_attributes /home/yjw/psqlodbc-build/drvconn.c:577
#5 PGAPI_DriverConnect /home/yjw/psqlodbc-build/drvconn.c:157
#6 SQLDriverConnect /home/yjw/psqlodbc-build/odbcapi.c:213
```
The `ok` output in the committed regression test represents the fixed behavior: malformed percent escapes no longer reach `conv_from_hex()`, and the connection path completes successfully.
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 11:07 ` "davecramer (@davecramer)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: davecramer (@davecramer) @ 2026-04-23 11:07 UTC (permalink / raw)
To: postgresql-interfaces/psqlodbc <[email protected]>
1. `conv_from_hex()` itself is still fragile — the validation is only in decode(), but
conv_from_hex() is a separate function that could be called from elsewhere in the future without
the guard. Consider adding a defensive check inside conv_from_hex() itself, or at minimum adding a
comment that callers must validate input first. This isn't a blocker for this PR, but worth noting.
2. The `else` branch silently passes `%` through as a literal — this is the right behavior for
robustness, but it's a semantic change. Previously, %G1 would produce a garbage character; now it
produces literal %G1. This is strictly better, but the PR description should explicitly call out
this behavior change for anyone relying on the old (broken) behavior. The PR description does
mention this — good.
3. Missing `test/tests` in the diff context — the diff shows the test is appended after
exe/interval-overflow-test, which is from our local branch. On main, the last entry is
exe/primarykeys-include-test. This will cause a merge conflict if our interval-overflow change
lands first, but that's a trivial conflict to resolve.
4. Minor: `snprintf` for a single character — the existing code uses snprintf(&outs[o], ilen + 1 -
o, "%c", conv_from_hex(&in[i])) to write one byte. A simple outs[o] = (char) conv_from_hex(&in[i])
would be clearer and avoid the format string overhead. But this is pre-existing code, not
introduced by this PR.
Verdict
This is a clean, well-scoped fix for a real UB issue. The test is properly integrated and verifies
actual connection success rather than just crash-freedom. Ready to merge with one minor suggestion:
add a comment on conv_from_hex() noting that callers must ensure two hex digits follow s[0].
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 11:51 ` "jarvis24young (@jarvis24young)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: jarvis24young (@jarvis24young) @ 2026-04-23 11:51 UTC (permalink / raw)
To: postgresql-interfaces/psqlodbc <[email protected]>
Thanks for the review.
I addressed the minor suggestion by adding a precondition comment to `conv_from_hex()`:
```c
/* Callers must ensure that s[1] and s[2] are hex digits. */
```
I kept the implementation scoped to the current fix rather than changing `conv_from_hex()` behavior directly, since `decode()` now validates before calling it.
I also checked the `test/tests` context against the current upstream `main`; `interval-overflow-test` is already present there after #174, so the appended `percent-decode-test` is now based on the latest main branch.
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-27 12:17 ` "jarvis24young (@jarvis24young)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: jarvis24young (@jarvis24young) @ 2026-04-27 12:17 UTC (permalink / raw)
To: postgresql-interfaces/psqlodbc <[email protected]>
@davecramer Hi, I rebased this PR onto the latest main and resolved the conflict in test/tests.
The PR is now mergeable, but the new GitHub Actions runs are waiting for maintainer approval because they show
action_required. Could you please approve/run the checks when you have a chance? Thanks.
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-28 10:33 ` "davecramer (@davecramer)" <[email protected]>
4 siblings, 0 replies; 6+ messages in thread
From: davecramer (@davecramer) @ 2026-04-28 10:33 UTC (permalink / raw)
To: postgresql-interfaces/psqlodbc <[email protected]>
Summary
This PR fixes undefined behavior in decode() in dlg_specific.c. When a connection-string value
contains a malformed percent escape (e.g., %, %A, %G1), conv_from_hex() reads past the valid data
and performs a left-shift of a negative value — UB caught by UBSan. The fix validates that two hex
digits follow % before calling conv_from_hex(), and passes malformed sequences through as literals.
The Bug
In the existing decode():
else if (inc == '%')
{
snprintf(&outs[o], ilen + 1 - o, "%c", conv_from_hex(&in[i]));
o++;
i += 2;
}
This unconditionally calls conv_from_hex(&in[i]), which reads s[1] and s[2] relative to the %. If
the string is % at the end, s[1] is '\0' and s[2] is past the allocation. conv_from_hex computes
val = '\0' - '0' = -48, then does y += val << 4 — left-shifting a negative value is undefined
behavior per C11 §6.5.7.
The Fix
Three changes:
1. New validation function is_valid_percent_encoding():
static BOOL
is_valid_percent_encoding(const char *s, size_t len)
{
if (len < 3)
return FALSE;
return isxdigit((unsigned char) s[1]) &&
isxdigit((unsigned char) s[2]);
}
2. Guarded decode path — only calls conv_from_hex when valid; otherwise passes % through as a
literal character.
3. Precondition comment on conv_from_hex(): /* Callers must ensure that s[1] and s[2] are hex
digits. */
Assessment
Correctness: ✅ Good
The len < 3 check in is_valid_percent_encoding correctly handles both truncated (%, %A) and
end-of-string cases by passing ilen - i as the remaining length. The isxdigit calls with (unsigned
char) casts are correct — this avoids UB from negative char values on platforms where char is
signed. The fallthrough to outs[o++] = inc (copying the literal %) is the right behavior for
malformed escapes.
Semantic change: ✅ Acknowledged and acceptable
Previously %G1 would produce a garbage byte (from the UB computation). Now it produces the literal
string %G1. This is strictly better — no real-world user depends on garbage output from UB. The PR
description explicitly calls this out.
Test: ✅ Well-designed
The test covers four cases:
- % — truncated at end of string
- %A — only one hex digit
- %G1 — non-hex character
- %20 — valid escape (space)
Each case goes through the full SQLDriverConnect() path and asserts success, not just
crash-freedom. The test uses pqopt=application_name=... which exercises decode_or_remove_braces() →
decode() without interfering with authentication. This is a smart test design choice.
conv_from_hex still fragile: ⚠️ Noted but acceptable
The precondition comment is the right minimal approach for this PR. A defensive isxdigit check
inside conv_from_hex itself would be belt-and-suspenders, but it would also change the return type
semantics (what does it return on invalid input?). The comment is sufficient — conv_from_hex is
static and only called from decode(), so the blast radius of a future misuse is limited to this
file.
Issues found
One minor concern with i += 2 on the valid path:
In the valid branch:
if (is_valid_percent_encoding(&in[i], ilen - i))
{
snprintf(&outs[o], ilen + 1 - o, "%c", conv_from_hex(&in[i]));
o++;
i += 2;
}
The i += 2 skips past the two hex digits, and then the for loop does i++, so i advances by 3 total
(past %XX). This is the same logic as the original code and is correct. No issue here.
The snprintf for a single byte is wasteful (as noted by davecramer in the PR comments), but it's
pre-existing code and out of scope for this fix. Agreed — not a blocker.
Verdict
Approve. This is a clean, well-scoped fix for real undefined behavior with a thoughtful regression
test. The contributor responded well to review feedback (added the precondition comment, rebased
onto latest main). The test/tests file now correctly appends after conn-settings-test from PR
#176's branch. Ready to merge once CI passes.
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2026-04-28 10:33 UTC | newest]
Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-23 09:07 [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-23 09:18 ` "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-23 11:07 ` "davecramer (@davecramer)" <[email protected]>
2026-04-23 11:51 ` "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-27 12:17 ` "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-28 10:33 ` "davecramer (@davecramer)" <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox