postgresql-interfaces/psqlodbc GitHub issues and pull requests (mirror)
From: jarvis24young (@jarvis24young) <[email protected]>
To: postgresql-interfaces/psqlodbc <[email protected]>
Subject: [postgresql-interfaces/psqlodbc] PR #187: Validate UTF-16 surrogate pairs before combining
Date: Tue, 12 May 2026 03:42:49 +0000
Message-ID: <[email protected]>
SQLWCHAR-to-UTF-8 conversion currently treats any UTF-16 high surrogate as the start of a surrogate pair. It then advances to the next code unit and reads it unconditionally, without checking that the unit lies within the input length or that it is a low surrogate.
That can read past the caller-supplied length when a wide-character ODBC API receives a dangling high surrogate at the end of its input. The new regression test exercises this through the public `SQLPrepareW()` path with a guarded one-code-unit SQLWCHAR buffer, so the old implementation faults deterministically if it reads `wstr[1]`.
Fix this by only taking the surrogate-pair path when:
- the current code unit is a high surrogate,
- there is another code unit within `ilen`, and
- the next code unit is a low surrogate.
Otherwise the existing non-pair path is used, avoiding the out-of-bounds read.
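The three conditions above can be sketched in isolation as follows. This is only an illustration, not the code in `win_unicode.c`: `wchar16` and `decode_utf16_at()` are hypothetical names, and passing a lone surrogate through as a single code unit is an assumption about what the non-pair path does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t wchar16; /* hypothetical stand-in for a 16-bit SQLWCHAR */

int is_high_surrogate(wchar16 c) { return (c & 0xFC00) == 0xD800; }
int is_low_surrogate(wchar16 c)  { return (c & 0xFC00) == 0xDC00; }

/*
 * Decode one code point starting at wstr[i] without ever reading
 * wstr[ilen] or beyond. Returns the number of code units consumed
 * (1 or 2) and stores the decoded value in *cp.
 */
size_t decode_utf16_at(const wchar16 *wstr, size_t ilen, size_t i,
                       uint32_t *cp)
{
    assert(i < ilen); /* caller must pass an in-range index */

    wchar16 c = wstr[i];

    /* Pair path: all three conditions must hold. The bounds check
     * comes before the read of wstr[i + 1] and short-circuits it. */
    if (is_high_surrogate(c) &&
        i + 1 < ilen &&
        is_low_surrogate(wstr[i + 1]))
    {
        *cp = 0x10000u
            + ((uint32_t)(c & 0x3FF) << 10)
            + (uint32_t)(wstr[i + 1] & 0x3FF);
        return 2;
    }

    /* Non-pair path (assumption: the unit is passed through as-is). */
    *cp = c;
    return 1;
}
```

With this shape, a one-unit input holding only `0xD83D` consumes one unit and never touches `wstr[1]`, while the pair `0xD83D 0xDE00` still combines to U+1F600.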
Reproduction on the old implementation, using the same black-box test with ASan and a guarded buffer:
```text
ERROR: AddressSanitizer: SEGV on unknown address
The signal is caused by a READ memory access.
#0 ucs2_to_utf8 win_unicode.c:191
#1 SQLPrepareW odbcapiw.c:439
#2 SQLPrepareW libodbc.so.2
#3 main test/src/surrogate-pair-test.c:109
```
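For readers unfamiliar with guarded buffers, one way to build one on a POSIX system is to place the data at the very end of a readable page and make the following page inaccessible. The sketch below is an assumption about how such a buffer could be set up, not the code in `test/src/surrogate-pair-test.c`; `wchar16` and `alloc_guarded_units()` are hypothetical names.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef uint16_t wchar16; /* hypothetical stand-in for a 16-bit SQLWCHAR */

/*
 * Map two pages, revoke all access to the second, and return a pointer
 * to n code units that end exactly at the page boundary. Any read of
 * element [n] then faults instead of silently succeeding.
 * On success, *base_out and *map_len_out describe the mapping so the
 * caller can munmap() it. Returns NULL on failure.
 */
wchar16 *alloc_guarded_units(size_t n, void **base_out, size_t *map_len_out)
{
    assert(n > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || n > (size_t)page / sizeof(wchar16))
        return NULL;

    size_t map_len = 2 * (size_t)page;
    unsigned char *base = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The second page becomes the guard. */
    if (mprotect(base + page, (size_t)page, PROT_NONE) != 0) {
        munmap(base, map_len);
        return NULL;
    }

    *base_out = base;
    *map_len_out = map_len;
    return (wchar16 *)(base + (size_t)page - n * sizeof(wchar16));
}
```

Calling `alloc_guarded_units(1, ...)`, storing a high surrogate in element 0, and passing the pointer with a length of one code unit gives the kind of input described above: an implementation that reads `wstr[1]` lands on the guard page.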
Tested after the fix:
```text
cd ~/psqlodbc-surrogate-oob-build/test
ODBCSYSINI=. ODBCINSTINI=./odbcinst.ini ODBCINI=./odbc.ini ./runsuite surrogate-pair --inputdir=.
TAP version 13
1..1
ok 1 - surrogate-pair
```
Also tested the target binary directly under ASan/UBSan with `detect_leaks=0`; it returns normally.