public inbox for [email protected]  
From: Laurenz Albe <[email protected]>
To: Jean-Christophe BOGGIO <[email protected]>
To: [email protected] <[email protected]>
Subject: Re: Importing a Windows database (in en_GB.CP1252) to linux
Date: Mon, 01 Dec 2025 16:07:30 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>

On Mon, 2025-12-01 at 14:37 +0100, Jean-Christophe BOGGIO wrote:
> I have a (custom) backup created on a Windows machine in en_GB.CP1252 encoding.
> And of course, some characters can't be imported because they don't exist in UTF-8.

Hm?  Which character can be encoded in WINDOWS-1252, but not in UTF-8?
Every character defined in WINDOWS-1252 has a Unicode code point, so
I don't think that can be the problem.
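
To illustrate with the byte from the error message, here is a small sketch.
It assumes an iconv that knows the charset name WINDOWS-1252 (glibc's does);
the function name is made up for this example.  Byte 0x92 is the right
single quotation mark (U+2019) in WINDOWS-1252 and converts to UTF-8 without
trouble, but the raw byte on its own is not valid UTF-8:

```shell
# Convert one WINDOWS-1252 byte to UTF-8 and print the result as hex.
# $1 is the byte as a three-digit octal number, e.g. 222 for 0x92.
# (Hypothetical helper name, for illustration only.)
win1252_byte_as_utf8_hex() {
  printf "\\$1" | iconv -f WINDOWS-1252 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

win1252_byte_as_utf8_hex 222; echo    # prints e28099

# The same raw byte, read as if it were already UTF-8, is rejected.
if printf '\222' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  echo "0x92 accepted as UTF-8"
else
  echo "0x92 rejected as UTF-8"
fi
```

So the character itself is representable; the error only says that a
WINDOWS-1252 byte was presented to the server as if it were UTF-8.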

> So I created a new cluster on PG18 port 5433 initialized in WIN1252 encoding:
> 
> $ \l imlocal 
>                                             List of databases 
>   Name   | Owner | Encoding | Locale Provider | Collate | Ctype | Locale | ICU Rules | Access privileges  
> ---------+-------+----------+-----------------+---------+-------+--------+-----------+------------------- 
>  imlocal | cat   | WIN1252  | libc            | C       | C     | ∅    | ∅       | ∅ 
>  (1 row)
> 
>   I am now trying to import the data in that database but I keep getting this error:
> 
> $ pg_restore -p 5433 -t csakafl -d imlocal imlocal20251127.backup 
>  pg_restore: error: COPY failed for table "csakafl": ERROR:  invalid byte sequence for encoding "UTF8": 0x92 
>  CONTEXT:  COPY csakafl, line 298
> 
>  So pg_restore still thinks I want to use UTF8.

That looks like pg_restore sets a wrong client_encoding, which is weird.

What do you get for

  pg_restore -p 5433 -t csakafl -s -f - imlocal20251127.backup | grep client_encoding

If the dump was taken from a WINDOWS-1252 encoded database, that line should
read

  SET client_encoding = 'WIN1252';

and everything should work fine.  But apparently, client_encoding is set to
UTF8 in your case, so the server rejects the byte 0x92 as invalid UTF-8.
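
If you prefer to see only the encoding name, the grep above could be
replaced by a small sed filter.  This is just a sketch; the function name
dump_client_encoding is made up here and is not part of PostgreSQL:

```shell
# Print the encoding name from a "SET client_encoding = '...';" line on stdin.
# (Hypothetical helper, for illustration only.)
dump_client_encoding() {
  sed -n "s/^SET client_encoding = '\(.*\)';\$/\1/p"
}

# Intended use, with the dump file from this thread:
#   pg_restore -s -f - imlocal20251127.backup | dump_client_encoding

# Stand-alone demonstration with a sample line:
printf "%s\n" "SET client_encoding = 'WIN1252';" | dump_client_encoding
```

For a dump taken from a WINDOWS-1252 database I would expect this to print
WIN1252.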

How did that happen?  How exactly did you take that dump?
Did you do anything (like an encoding conversion) with the dump after you took it?

Yours,
Laurenz Albe




