public inbox for [email protected]  
help / color / mirror / Atom feed
From: Daniel Verite <[email protected]>
To: Linus Heckemann <[email protected]>
Cc: pgsql-general mailing list <[email protected]>
Subject: Re: Can we get sha* function over text, that could be used in index?
Date: Fri, 20 Feb 2026 22:40:37 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <ygawm0a9bh9.fsf@localhost>

	Linus Heckemann wrote:

> - there is a byte-array representation of text columns, which appears to
>  be independent of database encoding

Not sure what you're refering to. Both the on-disk and in-memory
representations of text/varchar are encoding-dependent.

> The obvious (to a naive user, like I was) approach, casting to bytea,
> has exceptionally surprising behaviour: for many text strings, it does
> exactly what the naive user might hope for, giving back the UTF-8
> representation. But multiple distinct text strings, like '\033' and
> '\x1b', convert to the same byte string! And text strings containing a
> backslash that doesn't fit the bytea hex format or the bytea escape
> format will fail to convert completely!

Yes. It seems a common mistake to forget or ignore that
backslashes are special in the input text representation of bytea.
It might be not obvious from reading the doc at [1]
but we just need to quote backslashes by doubling them.

AFAIK a working solution for the OP would be:
 sha256(replace(colname, '\', '\\')::bytea)

The result is encoding-dependent, but that does not matter
in the context of an expression. index.
If the database ever needs to change its encoding, it will have to
be recreated entirely anyway.


[1]
https://www.postgresql.org/docs/current/datatype-binary.html#DATATYPE-BINARY-BYTEA-ESCAPE-FORMAT

Best regards,
-- 
Daniel Vérité 
https://postgresql.verite.pro/






view thread (2+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Can we get sha* function over text, that could be used in index?
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox