Date: Wed, 14 Feb 2024 09:42:03 -0600
From: "Karl O. Pinc"
To: Daniele Varrazzo
Cc: "psycopg@postgresql.org"
Subject: Re: Reporting UnicodeEncodeError info on arbitrary data sent to PG with psycopg3
Message-ID: <20240214094203.52d7e22d@slate.karlpinc.com>
References: <20240213193732.28cb8329@slate.karlpinc.com>

Hi Daniele,

On Wed, 14 Feb 2024 15:30:33 +0100
Daniele Varrazzo wrote:

> Note however that if you just want to know the Python codec you can
> find it in `conn.info.encoding`
> (https://www.psycopg.org/psycopg3/docs/api/objects.html#psycopg.ConnectionInfo.encoding):
>
>     >>> conn.info.encoding
>     'iso8859-1'
>     >>> "€".encode(conn.info.encoding)
>     ...
>     UnicodeEncodeError: 'latin-1' codec can't encode character
>     '\u20ac' in position 0: ordinal not in range(256)

Thanks very much for the help. Working directly with the
encoding of the server side, translated to python, is indeed
a more direct approach.

I did not use conn.info.encoding because the docs say that it
contains the _client_ encoding, not the server-side encoding
used to store the db content.

From the link above:

```
encoding

    The Python codec name of the connection’s client encoding.

    The value returned is always normalized to the Python codec name:

    conn.execute("SET client_encoding TO LATIN9")
    conn.info.encoding
    'iso8859-15'
```

Confirming the encodings, connecting to the "latin1" db with psql shows:

```
$ psql -U kop latin1
psql (15.5 (Debian 15.5-0+deb12u1))
Type "help" for help.

kop_latin1=> show client_encoding;
 client_encoding
-----------------
 UTF8
(1 row)

kop_latin1=> show server_encoding;
 server_encoding
-----------------
 LATIN1
(1 row)
```

But, conn.info.encoding does return iso8859-1.
So I think your documentation has confused client and server
in this case. If you can confirm this for me I'll go ahead
and use conn.info.encoding.

Thanks again.

Regards,

Karl
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
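
For reference, a minimal sketch of the kind of check discussed in this
thread: given a Python codec name (in practice, presumably the value of
conn.info.encoding), report which characters of a string cannot be encoded.
The helper name unencodable_chars is hypothetical and not part of psycopg.
It uses only the standard library, needs no database connection, and
assumes a stateless codec where each character can be tested on its own.

```python
def unencodable_chars(text: str, codec: str) -> list[tuple[int, str]]:
    """Return (position, character) pairs that `codec` cannot encode.

    Hypothetical helper for illustration; not part of psycopg.
    """
    problems = []
    for pos, ch in enumerate(text):
        try:
            # Encode one character at a time so every failure is found,
            # not only the first one that str.encode() would report.
            ch.encode(codec)
        except UnicodeEncodeError:
            problems.append((pos, ch))
    return problems


# "iso8859-1" stands in here for the value of conn.info.encoding.
print(unencodable_chars("price: 5€", "iso8859-1"))  # prints [(8, '€')]
```

Testing character by character trades speed for a complete report; calling
text.encode(codec) once and catching the exception is cheaper when only the
first failure matters.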