From: Tom Lane
To: Thomas Munro
Cc: PostgreSQL Hackers
Subject: Re: Do we still need MULE_INTERNAL?
Comments: In-reply-to Thomas Munro message dated "Wed, 11 Feb 2026 11:39:20 +1300"
Date: Tue, 10 Feb 2026 18:08:38 -0500
Message-ID: <3980443.1770764918@sss.pgh.pa.us>

Thomas Munro writes:
> MULE_INTERNAL solved a really hard problem years ago and must have
> been extremely useful, but I think we might be able to drop it now,
> and I have a patch.
FWIW, I am on board with dropping it, and I have another reason you
didn't list: AFAICS there are multiple ways to represent the same
string in MULE. Any character available in more than one encoding
has more than one equally-legitimate MULE representation, which is
catastrophic for functions as basic as text equality. You could
argue that this is no worse than the situation for combining
characters in Unicode, but there, there's at least an agreed-on
normal form.

> This history may be very well known to hackers in Japan, but I had to
> start from zero with my archeologist hat on, and I suspect this is as
> obscure to many others as it was to me, so here's what I have come up
> with:

Thanks for doing that research, BTW. This was mostly new to me.

regards, tom lane
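
To make the equality problem above concrete, here is a minimal Python sketch of a toy MULE-like scheme. It is an illustration only, not PostgreSQL's implementation: the function toy_mule_encode and the tag values 0x81 and 0x82 are assumptions made for this example, standing in for per-character-set lead bytes. The second half shows the Unicode contrast mentioned above, where a normal form (NFC) makes the two spellings compare equal.

```python
import unicodedata

# Hypothetical charset tags for this toy model; the real MULE lead
# bytes used by PostgreSQL are not asserted here.
TAG_LATIN1 = 0x81
TAG_LATIN2 = 0x82


def toy_mule_encode(text: str, codec: str, tag: int) -> bytes:
    """Encode text in a single-byte charset, prefixing every
    non-ASCII byte with a tag naming the charset it came from."""
    out = bytearray()
    for b in text.encode(codec):
        if b >= 0x80:
            out.append(tag)
        out.append(b)
    return bytes(out)


# U+00E9 (e with acute) is byte 0xE9 in both ISO 8859-1 and ISO 8859-2,
# so the same string has two tagged representations.
word = "caf\u00e9"
via_latin1 = toy_mule_encode(word, "iso8859_1", TAG_LATIN1)
via_latin2 = toy_mule_encode(word, "iso8859_2", TAG_LATIN2)

# Bytewise comparison sees two different strings.
same_text_different_bytes = via_latin1 != via_latin2

# Unicode has the analogous issue with combining characters, but an
# agreed-on normal form resolves it.
composed = "caf\u00e9"      # precomposed e-acute
decomposed = "cafe\u0301"   # e followed by combining acute accent
raw_equal = composed == decomposed
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))
```

In this toy model there is no rule that picks one tagged form over the other, so a bytewise equality test has nothing to canonicalize to, whereas the Unicode comparison succeeds once both sides are normalized.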