public inbox for [email protected]
pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
6+ messages / 1 participant
* pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
From: Michael Paquier <[email protected]> @ 2026-06-04 22:50 UTC
To: [email protected]

Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition incorrectly included TBASE as a valid T syllable, which is incorrect based on the Unicode specification (TBASE is one below the start of the range, range beginning at U+11A8). This would cause the TBASE to be silently swallowed in the normalization, leading to an incorrect result.

A couple of regression tests are added to check more patterns with Hangul recomposition and decomposition, on top of a test to check the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
REL_17_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/0c9cbbfb5be79d2061d7f897f6c6f4bccb886062

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)
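The off-by-one described in the commit message can be illustrated with a small standalone sketch. This is not the code from src/common/unicode_norm.c; the function names (is_lv_syllable, compose_lvt_buggy, compose_lvt_fixed, hangul_selfcheck) are hypothetical and chosen for this example only. The constants are the Hangul composition constants from the Unicode Standard. The sketch assumes the faulty check accepted a trailing code point equal to TBASE, which is consistent with the description above: adding a T index of zero returns the LV syllable unchanged, so U+11A7 disappears from the output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hangul composition constants from the Unicode Standard. */
#define SBASE  0xAC00u   /* first precomposed syllable */
#define TBASE  0x11A7u   /* one below the first trailing consonant (U+11A8) */
#define TCOUNT 28u       /* 27 trailing consonants plus the "no T" slot */
#define SCOUNT 11172u    /* number of precomposed syllables */

/* True if code is a precomposed LV syllable (no trailing consonant). */
bool is_lv_syllable(uint32_t code)
{
    return code >= SBASE && code < SBASE + SCOUNT &&
           (code - SBASE) % TCOUNT == 0;
}

/*
 * Assumed shape of the faulty check: the lower bound is inclusive, so
 * U+11A7 is accepted and composes with a T index of 0, yielding lv itself.
 */
bool compose_lvt_buggy(uint32_t lv, uint32_t t, uint32_t *result)
{
    if (is_lv_syllable(lv) && t >= TBASE && t < TBASE + TCOUNT)
    {
        *result = lv + (t - TBASE);
        return true;
    }
    return false;
}

/* Corrected check: valid trailing consonants are U+11A8 through U+11C2. */
bool compose_lvt_fixed(uint32_t lv, uint32_t t, uint32_t *result)
{
    if (is_lv_syllable(lv) && t > TBASE && t < TBASE + TCOUNT)
    {
        *result = lv + (t - TBASE);
        return true;
    }
    return false;
}

/* Small self-check contrasting the two variants on U+AC00 + U+11A7. */
void hangul_selfcheck(void)
{
    uint32_t out = 0;

    /* Buggy variant "composes" and returns U+AC00: the U+11A7 is lost. */
    assert(compose_lvt_buggy(0xAC00u, 0x11A7u, &out) && out == 0xAC00u);
    /* Fixed variant refuses, so U+11A7 stays in the output sequence. */
    assert(!compose_lvt_fixed(0xAC00u, 0x11A7u, &out));
    /* A real trailing consonant still composes: U+AC00 + U+11A8 = U+AC01. */
    assert(compose_lvt_fixed(0xAC00u, 0x11A8u, &out) && out == 0xAC01u);
}
```

Calling hangul_selfcheck() from a test driver exercises both variants. Only the lower bound differs between them, which matches the one-line change reported for unicode_norm.c in the diffstat, though the exact line changed there is not shown in this thread.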
* pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
From: Michael Paquier <[email protected]> @ 2026-06-04 22:50 UTC
To: [email protected]

Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition incorrectly included TBASE as a valid T syllable, which is incorrect based on the Unicode specification (TBASE is one below the start of the range, range beginning at U+11A8). This would cause the TBASE to be silently swallowed in the normalization, leading to an incorrect result.

A couple of regression tests are added to check more patterns with Hangul recomposition and decomposition, on top of a test to check the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
REL_16_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/82116023e424cba4ac7adefd261bd382ad6e40c8

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)
* pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
From: Michael Paquier <[email protected]> @ 2026-06-04 22:50 UTC
To: [email protected]

Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition incorrectly included TBASE as a valid T syllable, which is incorrect based on the Unicode specification (TBASE is one below the start of the range, range beginning at U+11A8). This would cause the TBASE to be silently swallowed in the normalization, leading to an incorrect result.

A couple of regression tests are added to check more patterns with Hangul recomposition and decomposition, on top of a test to check the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
REL_15_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/c391375ba4964dad9cbeedf700ae4df1d6af8b5b

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)
* pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
From: Michael Paquier <[email protected]> @ 2026-06-04 22:50 UTC
To: [email protected]

Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition incorrectly included TBASE as a valid T syllable, which is incorrect based on the Unicode specification (TBASE is one below the start of the range, range beginning at U+11A8). This would cause the TBASE to be silently swallowed in the normalization, leading to an incorrect result.

A couple of regression tests are added to check more patterns with Hangul recomposition and decomposition, on top of a test to check the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/8bb935d619f6397ca91742195965d20b0ee5df6c

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)
* pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
From: Michael Paquier <[email protected]> @ 2026-06-04 22:50 UTC
To: [email protected]

Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition incorrectly included TBASE as a valid T syllable, which is incorrect based on the Unicode specification (TBASE is one below the start of the range, range beginning at U+11A8). This would cause the TBASE to be silently swallowed in the normalization, leading to an incorrect result.

A couple of regression tests are added to check more patterns with Hangul recomposition and decomposition, on top of a test to check the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/f2ff15e4c37190e677437ecb76f706a05a645c6b

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)
* pgsql: Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
From: Michael Paquier <[email protected]> @ 2026-06-04 22:50 UTC
To: [email protected]

Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition incorrectly included TBASE as a valid T syllable, which is incorrect based on the Unicode specification (TBASE is one below the start of the range, range beginning at U+11A8). This would cause the TBASE to be silently swallowed in the normalization, leading to an incorrect result.

A couple of regression tests are added to check more patterns with Hangul recomposition and decomposition, on top of a test to check the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
REL_18_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/273fe94852b3a7e34fd171e8abdf1481beb302fa

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)