postgresql-interfaces/psqlodbc GitHub issues and pull requests (mirror)  
help / color / mirror / Atom feed
[postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
6+ messages / 2 participants
[nested] [flat]

* [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 09:07  "jarvis24young (@jarvis24young)" <[email protected]>
  0 siblings, 0 replies; 6+ messages in thread

From: jarvis24young (@jarvis24young) @ 2026-04-23 09:07 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

This fixes sanitizer-detected undefined behavior in connection-string percent decoding.

Root cause:
- `decode()` treated every `%` in a connection-string value as the beginning of a `%xx` percent escape.
- It called `conv_from_hex(&in[i])` without first checking that two following hex digits were present.
- For a truncated value such as `pqopt=application_name=%`, `conv_from_hex()` reads the string terminator, computes a negative value, and left-shifts it. UBSan reports this as undefined behavior.

Fix:
- Decode only valid `%xx` escapes where both following characters are hex digits.
- Leave malformed or truncated percent sequences unchanged, preserving compatibility while avoiding undefined behavior.

Regression test:
- Adds `percent-decode-test`, which exercises the public `SQLDriverConnect()` path with `pqopt=application_name=%`, `%A`, `%G1`, and `%20`.
- The test requires `SQLDriverConnect()` to succeed for each case, so it verifies more than just "no crash".

Verification performed under ASan/UBSan:

```bash
cd ~/psqlodbc-build/test
export LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/13/libasan.so
ODBCSYSINI=. ODBCINSTINI=./odbcinst.ini ODBCINI=./odbc.ini \
ASAN_OPTIONS=halt_on_error=1:abort_on_error=1 \
UBSAN_OPTIONS=halt_on_error=1:print_stacktrace=1 \
./exe/percent-decode-test
```

Output:

```text
truncated percent: ok
one hex digit: ok
non-hex escape: ok
valid percent escape: ok
```

^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 09:18  "jarvis24young (@jarvis24young)" <[email protected]>
  4 siblings, 0 replies; 6+ messages in thread

From: jarvis24young (@jarvis24young) @ 2026-04-23 09:18 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

I updated the regression test to make it stronger.

Instead of using `PWD=...` and accepting an authentication failure, the test now uses `pqopt=application_name=...`. That still exercises the same `decode_or_remove_braces()` / `decode()` path for connection-string percent decoding, but it does not override the DSN password, so `SQLDriverConnect()` is expected to succeed for each case.

For reference, before this patch the truncated percent case triggers UBSan in my local ASan/UBSan build:

```text
dlg_specific.c:1584:12: runtime error: left shift of negative value -48
    #0 conv_from_hex /home/yjw/psqlodbc-build/dlg_specific.c:1584
    #1 decode /home/yjw/psqlodbc-build/dlg_specific.c:1612
    #2 decode_or_remove_braces /home/yjw/psqlodbc-build/dlg_specific.c:1658
    #3 copyConnAttributes /home/yjw/psqlodbc-build/dlg_specific.c:639
    #4 dconn_get_attributes /home/yjw/psqlodbc-build/drvconn.c:577
    #5 PGAPI_DriverConnect /home/yjw/psqlodbc-build/drvconn.c:157
    #6 SQLDriverConnect /home/yjw/psqlodbc-build/odbcapi.c:213
```

The `ok` output in the committed regression test represents the fixed behavior: malformed percent escapes no longer reach `conv_from_hex()`, and the connection path completes successfully.

^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 11:07  "davecramer (@davecramer)" <[email protected]>
  4 siblings, 0 replies; 6+ messages in thread

From: davecramer (@davecramer) @ 2026-04-23 11:07 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

1. `conv_from_hex()` itself is still fragile — the validation is only in decode(), but             
  conv_from_hex() is a separate function that could be called from elsewhere in the future without   
  the guard. Consider adding a defensive check inside conv_from_hex() itself, or at minimum adding a 
  comment that callers must validate input first. This isn't a blocker for this PR, but worth noting.
  2. The `else` branch silently passes `%` through as a literal — this is the right behavior for     
  robustness, but it's a semantic change. Previously, %G1 would produce a garbage character; now it  
  produces literal %G1. This is strictly better, but the PR description should explicitly call out   
  this behavior change for anyone relying on the old (broken) behavior. The PR description does      
  mention this — good.                                                                               
  3. Missing `test/tests` in the diff context — the diff shows the test is appended after            
  exe/interval-overflow-test, which is from our local branch. On main, the last entry is             
  exe/primarykeys-include-test. This will cause a merge conflict if our interval-overflow change     
  lands first, but that's a trivial conflict to resolve.                                             
  4. Minor: `snprintf` for a single character — the existing code uses snprintf(&outs[o], ilen + 1 - 
  o, "%c", conv_from_hex(&in[i])) to write one byte. A simple outs[o] = (char) conv_from_hex(&in[i]) 
  would be clearer and avoid the format string overhead. But this is pre-existing code, not          
  introduced by this PR.                                                                             
                                                                                                     
  Verdict                                                                                            
                                                                                                     
  This is a clean, well-scoped fix for a real UB issue. The test is properly integrated and verifies 
  actual connection success rather than just crash-freedom. Ready to merge with one minor suggestion:
  add a comment on conv_from_hex() noting that callers must ensure two hex digits follow s[0].  


^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-23 11:51  "jarvis24young (@jarvis24young)" <[email protected]>
  4 siblings, 0 replies; 6+ messages in thread

From: jarvis24young (@jarvis24young) @ 2026-04-23 11:51 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Thanks for the review.

I addressed the minor suggestion by adding a precondition comment to `conv_from_hex()`:

```c
/* Callers must ensure that s[1] and s[2] are hex digits. */
```

I kept the implementation scoped to the current fix rather than changing `conv_from_hex()` behavior directly, since `decode()` now validates before calling it.

I also checked the `test/tests` context against the current upstream `main`; `interval-overflow-test` is already present there after #174, so the appended `percent-decode-test` is now based on the latest main branch.

^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-27 12:17  "jarvis24young (@jarvis24young)" <[email protected]>
  4 siblings, 0 replies; 6+ messages in thread

From: jarvis24young (@jarvis24young) @ 2026-04-27 12:17 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

@davecramer   Hi, I rebased this PR onto the latest main and resolved the conflict in test/tests.
  The PR is now mergeable, but the new GitHub Actions runs are waiting for maintainer approval because they show
  action_required. Could you please approve/run the checks when you have a chance? Thanks.

^ permalink  raw  reply  [nested|flat] 6+ messages in thread

* Re: [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values
@ 2026-04-28 10:33  "davecramer (@davecramer)" <[email protected]>
  4 siblings, 0 replies; 6+ messages in thread

From: davecramer (@davecramer) @ 2026-04-28 10:33 UTC (permalink / raw)
  To: postgresql-interfaces/psqlodbc <[email protected]>

Summary
  
  This PR fixes undefined behavior in decode() in dlg_specific.c. When a connection-string value
  contains a malformed percent escape (e.g., %, %A, %G1), conv_from_hex() reads past the valid data
  and performs a left-shift of a negative value — UB caught by UBSan. The fix validates that two hex
  digits follow % before calling conv_from_hex(), and passes malformed sequences through as literals.
  
  The Bug
  
  In the existing decode():
  
  else if (inc == '%')
  {
      snprintf(&outs[o], ilen + 1 - o, "%c", conv_from_hex(&in[i]));
      o++;
      i += 2;
  }
  
  This unconditionally calls conv_from_hex(&in[i]), which reads s[1] and s[2] relative to the %. If
  the string is % at the end, s[1] is '\0' and s[2] is past the allocation. conv_from_hex computes
  val = '\0' - '0' = -48, then does y += val << 4 — left-shifting a negative value is undefined
  behavior per C11 §6.5.7.
  
  The Fix
  
  Three changes:
  
  1. New validation function is_valid_percent_encoding():
  
  static BOOL
  is_valid_percent_encoding(const char *s, size_t len)
  {
      if (len < 3)
          return FALSE;
      return isxdigit((unsigned char) s[1]) &&
             isxdigit((unsigned char) s[2]);
  }
  
  2. Guarded decode path — only calls conv_from_hex when valid; otherwise passes % through as a
  literal character.
  3. Precondition comment on conv_from_hex(): /* Callers must ensure that s[1] and s[2] are hex
  digits. */
  
  Assessment
  
  Correctness: ✅ Good
  
  The len < 3 check in is_valid_percent_encoding correctly handles both truncated (%, %A) and
  end-of-string cases by passing ilen - i as the remaining length. The isxdigit calls with (unsigned
  char) casts are correct — this avoids UB from negative char values on platforms where char is
  signed. The fallthrough to outs[o++] = inc (copying the literal %) is the right behavior for
  malformed escapes.
  
  Semantic change: ✅ Acknowledged and acceptable
  
  Previously %G1 would produce a garbage byte (from the UB computation). Now it produces the literal
  string %G1. This is strictly better — no real-world user depends on garbage output from UB. The PR
  description explicitly calls this out.
  
  Test: ✅ Well-designed
  
  The test covers four cases:
  
  - % — truncated at end of string
  - %A — only one hex digit
  - %G1 — non-hex character
  - %20 — valid escape (space)
  
  Each case goes through the full SQLDriverConnect() path and asserts success, not just
  crash-freedom. The test uses pqopt=application_name=... which exercises decode_or_remove_braces() →
  decode() without interfering with authentication. This is a smart test design choice.
  
  conv_from_hex still fragile: ⚠️ Noted but acceptable
  
  The precondition comment is the right minimal approach for this PR. A defensive isxdigit check
  inside conv_from_hex itself would be belt-and-suspenders, but it would also change the return type
  semantics (what does it return on invalid input?). The comment is sufficient — conv_from_hex is
  static and only called from decode(), so the blast radius of a future misuse is limited to this
  file.
  
  Issues found
  
  One minor concern with i += 2 on the valid path:
  
  In the valid branch:
  
  if (is_valid_percent_encoding(&in[i], ilen - i))
  {
      snprintf(&outs[o], ilen + 1 - o, "%c", conv_from_hex(&in[i]));
      o++;
      i += 2;
  }
  
  The i += 2 skips past the two hex digits, and then the for loop does i++, so i advances by 3 total
  (past %XX). This is the same logic as the original code and is correct. No issue here.
  
  The snprintf for a single byte is wasteful (as noted by davecramer in the PR comments), but it's
  pre-existing code and out of scope for this fix. Agreed — not a blocker.
  
  Verdict
  
  Approve. This is a clean, well-scoped fix for real undefined behavior with a thoughtful regression
  test. The contributor responded well to review feedback (added the precondition comment, rebased
  onto latest main). The test/tests file now correctly appends after conn-settings-test from PR
  #176's branch. Ready to merge once CI passes.

^ permalink  raw  reply  [nested|flat] 6+ messages in thread


end of thread, other threads:[~2026-04-28 10:33 UTC | newest]

Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-23 09:07 [postgresql-interfaces/psqlodbc] PR #175: Validate percent escapes before decoding connection-string values "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-23 09:18 ` "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-23 11:07 ` "davecramer (@davecramer)" <[email protected]>
2026-04-23 11:51 ` "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-27 12:17 ` "jarvis24young (@jarvis24young)" <[email protected]>
2026-04-28 10:33 ` "davecramer (@davecramer)" <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox