From: Laurenz Albe <[email protected]>
To: Jean-Christophe BOGGIO <[email protected]>
To: [email protected] <[email protected]>
Subject: Re: Importing a Windows database (in en_GB.CP1252) to linux
Date: Tue, 02 Dec 2025 12:26:54 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>

On Tue, 2025-12-02 at 10:39 +0100, Jean-Christophe BOGGIO wrote:
> > That looks like pg_restore sets a wrong client_encoding, which is weird.
> > 
> > What do you get for
> > 
> >   pg_restore -p 5433 -t csakafl -s -f - imlocal20251127.backup | grep client_encoding
> 
> SET client_encoding = 'UTF8';
> 
> 
> > How did that happen?  How exactly did you take that dump?
> 
> This backup is a transfer from an iSeries DB2 database. It has been a nightmare to
> get this working (and took around 10 days to finalize). We set up a FDW Server using
> odbc_fdw, recreated all the tables (around 2k) and INSERTed the DB2 data to the PG tables.
> 
> Then we used PgAdmin that came with PostgreSQL 17 on the Windows machine.
> 
> I double-checked with the client: the database is in en_GB.CP1252.

The DB2 database or the PostgreSQL database?

It must be the DB2 database, because otherwise the dump would contain

  SET client_encoding = 'WIN1252';

That is, unless you created the dump with

  pg_dump --encoding=UTF8

But then the dump couldn't contain invalid UTF-8 byte sequences.

Having used pgAdmin, you probably don't know the pg_dump command line that was used.


My best guess is that odbc_fdw has a bug: it does not check whether the strings
are properly encoded, so you ended up with corrupted data in your PostgreSQL
database.  But I am not sure.
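
One way to test that guess on the Linux side is sketched below. It assumes
that the iconv utility is installed; is_valid_utf8 is a hypothetical helper
name, not something that ships with PostgreSQL. iconv exits with a non-zero
status when its input contains a byte sequence that is invalid in the source
encoding, so converting from UTF-8 to UTF-8 acts as a validity check:

```shell
# Sketch only: succeed if standard input is valid UTF-8, fail otherwise.
# is_valid_utf8 is a hypothetical helper name; it relies on iconv being installed.
is_valid_utf8() {
    # Converting UTF-8 to UTF-8 changes nothing, but iconv returns a
    # non-zero exit status on the first invalid byte sequence it meets.
    iconv -f UTF-8 -t UTF-8 > /dev/null 2>&1
}
```

For example, the dump could be piped through the helper like this:

  pg_restore -f - imlocal20251127.backup | is_valid_utf8 \
    && echo "valid UTF-8" || echo "invalid byte sequences found"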

> > Did you do anything (like an encoding conversion) with the dump after you took it?
> 
> No, the backup is in custom format so I can't touch it (or at least I don't know how I could).
> 
> Where can I go from here?

You can try the following:

- convert the custom format dump into an SQL script with

  pg_restore -f script.sql imlocal20251127.backup

- edit script.sql and change the client_encoding line to read

  SET client_encoding = 'WIN1252';

- restore that script with "psql":

  psql -f script.sql -d newdb

That should work if *all* the strings are in WINDOWS-1252 encoding.
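
The edit in the second step can also be scripted instead of done by hand.
What follows is a minimal sketch, assuming a POSIX shell and sed;
fix_client_encoding is a hypothetical helper name, and the script name and
target encoding are passed as arguments:

```shell
# Sketch only: rewrite the "SET client_encoding" line of an SQL script.
# fix_client_encoding is a hypothetical helper, not a PostgreSQL tool.
fix_client_encoding() {
    script=$1      # path of the SQL script produced by pg_restore -f
    encoding=$2    # encoding name to put in its place, e.g. WIN1252

    # Replace whatever encoding the line currently names, write the result
    # to a temporary file, and move it over the original only on success.
    sed "s/^SET client_encoding = '[^']*';/SET client_encoding = '$encoding';/" \
        "$script" > "$script.tmp" && mv "$script.tmp" "$script"
}
```

With that helper defined, the three steps would become:

  pg_restore -f script.sql imlocal20251127.backup
  fix_client_encoding script.sql WIN1252
  psql -f script.sql -d newdb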

Yours,
Laurenz Albe




