From: Daniele Varrazzo
Date: Wed, 14 Feb 2024 15:30:33 +0100
Subject: Re: Reporting UnicodeEncodeError info on arbitrary data sent to PG with psycopg3
To: "Karl O. Pinc"
Cc: "psycopg@postgresql.org"
In-Reply-To: <20240213193732.28cb8329@slate.karlpinc.com>

Hello Karl,

On Wed, 14 Feb 2024 at 02:37, Karl O. Pinc wrote:
> This does not work. What is wrong with what I'm doing
> and how do I do what I want? (And how am I supposed to
> know why this does not work and what works?) I call the
> dumper because I want to rely on psycopg3's mechanisms
> and not have to query the server for its encoding
> and figure out the PG->Python encoding mappings myself.

Keep in mind that you are playing with objects that are somewhat
internal, so it's not impossible that these interfaces will change in
the future. That's not planned at the moment, and it wouldn't happen in
a minor version anyway.

However, the main problem I see there is that
`conn.adapters.get_dumper()` returns a class. If you want a dumper you
must instantiate it. The following works as you expect:

    >>> conn.execute("set client_encoding to 'latin1'")
    >>> dumper = conn.adapters.get_dumper(str, psycopg.adapt.PyFormat.TEXT)(str, conn)
    >>> dumper.dump('€')
    ...
    UnicodeEncodeError: 'latin-1' codec can't encode character
    '\u20ac' in position 0: ordinal not in range(256)

Note however that if you just want to know the Python codec you can
find it in `conn.info.encoding`
(https://www.psycopg.org/psycopg3/docs/api/objects.html#psycopg.ConnectionInfo.encoding):

    >>> conn.info.encoding
    'iso8859-1'
    >>> "€".encode(conn.info.encoding)
    ...
    UnicodeEncodeError: 'latin-1' codec can't encode character
    '\u20ac' in position 0: ordinal not in range(256)

Hope this helps

-- Daniele
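
As a rough sketch of how the codec name could be used to report on
arbitrary data before sending it, the helper below tries the encoding
and collects the details carried by the `UnicodeEncodeError`. The
function name `describe_encode_error` and the sample string are
hypothetical, made up for illustration; the codec is passed in as a
plain string so the snippet runs without a database, whereas with a
live connection it would presumably be `conn.info.encoding`.

```python
def describe_encode_error(text, encoding):
    """Return None if `text` encodes cleanly, else a dict describing the failure.

    Hypothetical helper for illustration; not part of psycopg.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        # These attributes are set by Python on every UnicodeEncodeError.
        return {
            "codec": exc.encoding,
            "start": exc.start,
            "end": exc.end,
            "bad": exc.object[exc.start:exc.end],
            "reason": exc.reason,
        }
    return None


# Assumption: with a real connection, pass conn.info.encoding instead of
# the literal 'iso8859-1' (the value reported for client_encoding latin1).
info = describe_encode_error("price: 10 \u20ac", "iso8859-1")
clean = describe_encode_error("price: 10 EUR", "iso8859-1")
print(info)
print(clean)
```

The `start` and `end` offsets index into the original string, so they
can be used to point at the offending characters in an error message.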