Message-ID:
From: "vlsi (@vlsi)"
To: "pgjdbc/pgjdbc"
Date: Tue, 26 May 2026 15:08:11 +0000
Subject: [pgjdbc/pgjdbc] PR #4115: i18n: convert ISO-8859-x .po files to UTF-8
List-Id:
X-GitHub-Author-Id: 213894
X-GitHub-Author-Login: vlsi
X-GitHub-Issue: 4115
X-GitHub-Labels: chore
X-GitHub-Repo: pgjdbc/pgjdbc
X-GitHub-State: merged
X-GitHub-Type: pull_request
X-GitHub-Url: https://github.com/pgjdbc/pgjdbc/pull/4115
Content-Type: text/plain; charset=utf-8

## What

Re-encode five translation files from legacy 8-bit encodings to UTF-8 and update the `Content-Type` charset declaration in each header to match. No other changes.

| file | was | now |
| --- | --- | --- |
| `cs.po` | ISO-8859-2 | UTF-8 |
| `de.po` | ISO-8859-1 | UTF-8 |
| `fr.po` | ISO-8859-1 | UTF-8 |
| `it.po` | ISO-8859-1 | UTF-8 |
| `nl.po` | ISO-8859-1 | UTF-8 (file body is pure ASCII; only the header changed) |

The other 11 `.po` files were already UTF-8 with a matching header and are untouched.

## Why

The translation sources used a mix of encodings, which makes tooling (editors, grep, diff viewers, future xgettext runs) treat the files inconsistently. Standardising on UTF-8 lets every `.po` file in the tree be opened, searched, and patched the same way, and UTF-8 is the de facto modern default for gettext catalogues.

GitHub diffs also look better for UTF-8 files. Here's an example from `cs.po`:

[Image: cs po]

## How to verify

For each converted file, round-trip the new version back through `iconv` to its original encoding and diff the result against the parent commit. Only the single `Content-Type` line should differ:

```sh
# Requires bash (process substitution); run from the repository root
# with the conversion commit checked out as HEAD.
for entry in "cs.po:ISO-8859-2" "de.po:ISO-8859-1" "fr.po:ISO-8859-1" \
             "it.po:ISO-8859-1" "nl.po:ISO-8859-1"; do
  # Split "file:encoding" into its two parts.
  f="${entry%%:*}"; src="${entry##*:}"
  cur="pgjdbc/src/main/java/org/postgresql/translation/$f"
  # Left: the file as of the parent commit. Right: the new file converted back.
  diff <(git show "HEAD~1:$cur") <(iconv -f UTF-8 -t "$src" "$cur")
done
```

Each diff shows exactly one changed line: `charset=` -> `charset=UTF-8`.
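
The description does not say how the conversion itself was performed. For reference, the two steps named under "What" (re-encode the bytes, then update the charset declaration) could be sketched as follows. This is only an illustration under stated assumptions: `po_to_utf8` is a hypothetical helper name that is not part of the PR, and the use of `iconv` plus `sed` for the forward conversion is assumed rather than confirmed.

```sh
# Hypothetical helper (not part of the PR): re-encode one .po file to UTF-8
# and update its Content-Type charset declaration to match.
# Usage: po_to_utf8 <file> <source-encoding>
# The body runs in a subshell, so its variables do not leak into the caller.
po_to_utf8() (
  file="$1"; src="$2"
  tmp="$(mktemp)" || exit 1
  # Step 1: convert the bytes from the legacy encoding to UTF-8.
  # Step 2: rewrite only the charset value on the Content-Type header line.
  if iconv -f "$src" -t UTF-8 "$file" > "$tmp" &&
     sed '/^"Content-Type:/s/charset=[A-Za-z0-9_-]*/charset=UTF-8/' "$tmp" > "$tmp.new"
  then
    # Overwrite in place so the original file's permissions are kept.
    cat "$tmp.new" > "$file"; status=$?
  else
    # Leave the original file untouched if either step failed.
    status=1
  fi
  rm -f "$tmp" "$tmp.new"
  exit "$status"
)
```

Applied to the files in the table, with the directory taken from the verification script, usage might look like `po_to_utf8 pgjdbc/src/main/java/org/postgresql/translation/cs.po ISO-8859-2`. For `nl.po`, whose body is pure ASCII, the `iconv` step leaves the bytes unchanged and only the header line is rewritten.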