public inbox for [email protected]
help / color / mirror / Atom feedChanging a varchar(7) domain into text directly in pg_type
4+ messages / 3 participants
[nested] [flat]
* Changing a varchar(7) domain into text directly in pg_type
@ 2025-05-27 14:27 Richard Zetterberg <[email protected]>
2025-05-27 18:09 ` Re: Changing a varchar(7) domain into text directly in pg_type Adrian Klaver <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Richard Zetterberg @ 2025-05-27 14:27 UTC (permalink / raw)
To: [email protected]
Hello,
I have a read-only table that contains a set of never changing categories.
Each category has a unique alpha numerical ID and a description. The
purpose of this table is so that other tables can reference the ID of this
table, to make sure that they don't contain invalid/unknown categories and
so that users can lookup the description of each category.
This category table has the following type on the ID column: "varchar(7)"
(yes, I should have used text). In order to avoid having to type
"varchar(7)" in all the tables that references the category table, I
created this domain that I used as type for all referencing columns:
"CREATE DOMAIN cat.id AS varchar(7);".
During some data archeology, I found a bunch of new categories that haven't
been imported into the database yet, and they have IDs longer than 7.
I've seen claims that varchar and text have the same representation on disk
and that they are treated the same way "under the hood", except for the
extra constraint checks on varchar. So, I thought that maybe I could just
change the type of my domain to text, directly in pg_type and that should
solve my problems.
After some thinkering, and looking at how "CREATE DOMAIN cat.id AS
varchar(7);" and "CREATE DOMAIN cat.id AS text;" appeared in pg_type and
information_schema.domains, I came up with this query:
```
WITH text_oid AS (
SELECT typ.oid AS text_oid
FROM pg_type AS typ
INNER JOIN pg_namespace AS ns
ON typ.typnamespace = ns.oid
AND typ.typname = 'text'
), target_oid AS (
SELECT typ.oid AS target_oid
FROM pg_type AS typ
INNER JOIN pg_namespace AS ns
ON typ.typnamespace = ns.oid
AND ns.nspname = 'cat
AND typ.typname = 'id'
), oids AS (
SELECT *
FROM text_oid
CROSS JOIN target_oid
) UPDATE pg_type AS styp
SET typoutput = 'textout'
, typsend = 'textsend'
, typbasetype = o.text_oid
, typtypmod = -1
FROM oids AS o
WHERE oid = o.target_oid;
```
After running that query, my domain had the type text instead of varchar(7)
and all the
fkeys to my category table seems to be working.
Could this be a viable option to solve my problem? Or will I face serious
problems later down the line that I haven't discovered/considered yet?
Thanks for any insight,
Richard Zetterberg
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: Changing a varchar(7) domain into text directly in pg_type
2025-05-27 14:27 Changing a varchar(7) domain into text directly in pg_type Richard Zetterberg <[email protected]>
@ 2025-05-27 18:09 ` Adrian Klaver <[email protected]>
2025-05-28 09:53 ` Re: Changing a varchar(7) domain into text directly in pg_type Richard Zetterberg <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Adrian Klaver @ 2025-05-27 18:09 UTC (permalink / raw)
To: Richard Zetterberg <[email protected]>; [email protected]
On 5/27/25 7:27 AM, Richard Zetterberg wrote:
> Hello,
>
> I have a read-only table that contains a set of never changing
> categories. Each category has a unique alpha numerical ID and a
> description. The purpose of this table is so that other tables can
> reference the ID of this table, to make sure that they don't contain
> invalid/unknown categories and so that users can lookup the description
> of each category.
Define 'read-only'. In other words can you temporarily make it not
read-only and change the type to text(or just varchar (no length specifier)?
This would be the easiest fix.
>
> This category table has the following type on the ID column:
> "varchar(7)" (yes, I should have used text). In order to avoid having to
> type "varchar(7)" in all the tables that references the category table,
> I created this domain that I used as type for all referencing columns:
> "CREATE DOMAIN cat.id <http://cat.id; AS varchar(7);".
>
> During some data archeology, I found a bunch of new categories that
> haven't been imported into the database yet, and they have IDs longer
> than 7.
If the read-only table field has a maximum length of 7 and you have
incoming data that is coming in longer then 7 characters, how are they
going to reference the read-only table?
>
> I've seen claims that varchar and text have the same representation on
> disk and that they are treated the same way "under the hood", except for
> the extra constraint checks on varchar. So, I thought that maybe I could
> just change the type of my domain to text, directly in pg_type and that
> should solve my problems
Per my comment above, how?
> Thanks for any insight,
> Richard Zetterberg
--
Adrian Klaver
[email protected]
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: Changing a varchar(7) domain into text directly in pg_type
2025-05-27 14:27 Changing a varchar(7) domain into text directly in pg_type Richard Zetterberg <[email protected]>
2025-05-27 18:09 ` Re: Changing a varchar(7) domain into text directly in pg_type Adrian Klaver <[email protected]>
@ 2025-05-28 09:53 ` Richard Zetterberg <[email protected]>
2025-05-28 14:13 ` Re: Changing a varchar(7) domain into text directly in pg_type Ron Johnson <[email protected]>
0 siblings, 1 reply; 4+ messages in thread
From: Richard Zetterberg @ 2025-05-28 09:53 UTC (permalink / raw)
To: Adrian Klaver <[email protected]>; +Cc: [email protected]
Thanks for the reply, Adrian!
I must have explained my problem poorly, sorry for that.
Let me present an imaginary scenario and hopefully that clears things up.
It's 2015 and you want to start a birdwatching club and want to keep track
of sightings in a postgres database.
To make sure that users in your birdwatching club all enter the specie of
the observed bird correctly, you decide to create a table that will contain
all distinct bird species. Since there are over >10 000 bird species you
decide to download an existing taxonomy of bird species. On
birdtaxonomies.com you download edition 5 of the bird species taxonomy as a
CSV-file. Each row of this file represents a bird specie and it has two
columns: unique ID and name.
You observe that the IDs in the downloaded file have at most 6 characters.
To make room for additional bird species in coming editions of the
taxonomy, you decide to use `varchar(7)` as the type of the ID column. To
improve readability, you create a type alias:
```
CREATE DOMAIN species_id AS varchar(7);
```
Then you create the table that contains the bird species taxonomy:
```
CREATE TABLE species (
id species_id PRIMARY KEY,
name text NOT NULL
);
```
Once all the bird species are imported from the taxonomy file into the
`species` table, nothing is changed in this file.
To guarantee that people enter the specie correctly, you now add a foreign
key to all of your observation tables referencing the `species` table.
Here's an example of what one of those tables look like:
```
CREATE TABLE sightings (
id serial PRIMARY KEY,
specie_id species_id NOT NULL REFERENCES species (id),
coordinates geography(POINT, 4326) NOT NULL,
date timestamptz NOT NULL
);
```
Each year, birdtaxonomies.com release a new edition of the bird species
taxonomy, that you update your species table with. But this year,
birdtaxonomies.com comes decides they want a more fine grained way of
classifying bird species, which results in latest edition of bird species
taxonomy to have more than 1 million species and IDs that are as long as 9
characters.
After these 10 years your club has more than 1 million members worldwide,
more than 20 different kinds of observations tables, procedures and views,
billions of rows of different kinds of observations and a total size of
over 100 GB.
In order to import the latest taxonomy, you need to change the type of the
domain `species_id` from `varchar(7)` to `text`. But postgres doesn't allow
you to change the type of a domain with a single command. Instead what you
need to do is:
1. Rename current domain: `ALTER DOMAIN species_id RENAME TO
species_id_depricated;`
2. Create a new domain: `CREATE DOMAIN species_id AS text;`
3. Change all >20 tables, procedures and views to use the new domain
4. Drop the old domain
You had done a similar but smaller change a couple of years ago, and that
meant days of downtime. Also, you are a lazy person and it's tedious to
write the SQL code to change all tables, procedures and views. So, you
decide to manipulate `pg_type` directly, since you read somewhere online
that `varchar` and `text` are basically the same under the hood.
End of story.
Basically, I was hoping that my shortcut would save time and that this
small `pg_type` manipulation can be done without any side effects. I don't
have any knowledge of postgres internals, so I don't know if postgres would
accept this change and carry on like nothing happened. Or, if it will break
my indices or how queries are executed.
In my head, this change would be instant and postgres would carry on like
nothing happened and that the domain always had the type `text`. Is this a
fools errand? Or could it actually make sense in this situation?
Thanks
Richard
On Tue, May 27, 2025 at 8:10 PM Adrian Klaver <[email protected]>
wrote:
>
>
> On 5/27/25 7:27 AM, Richard Zetterberg wrote:
> > Hello,
> >
> > I have a read-only table that contains a set of never changing
> > categories. Each category has a unique alpha numerical ID and a
> > description. The purpose of this table is so that other tables can
> > reference the ID of this table, to make sure that they don't contain
> > invalid/unknown categories and so that users can lookup the description
> > of each category.
> Define 'read-only'. In other words can you temporarily make it not
> read-only and change the type to text(or just varchar (no length
> specifier)?
> This would be the easiest fix.
>
> >
> > This category table has the following type on the ID column:
> > "varchar(7)" (yes, I should have used text). In order to avoid having to
> > type "varchar(7)" in all the tables that references the category table,
> > I created this domain that I used as type for all referencing columns:
> > "CREATE DOMAIN cat.id <http://cat.id; AS varchar(7);".
> >
> > During some data archeology, I found a bunch of new categories that
> > haven't been imported into the database yet, and they have IDs longer
> > than 7.
>
> If the read-only table field has a maximum length of 7 and you have
> incoming data that is coming in longer then 7 characters, how are they
> going to reference the read-only table?
>
> >
> > I've seen claims that varchar and text have the same representation on
> > disk and that they are treated the same way "under the hood", except for
> > the extra constraint checks on varchar. So, I thought that maybe I could
> > just change the type of my domain to text, directly in pg_type and that
> > should solve my problems
>
> Per my comment above, how?
>
>
> > Thanks for any insight,
> > Richard Zetterberg
>
> --
> Adrian Klaver
> [email protected]
>
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: Changing a varchar(7) domain into text directly in pg_type
2025-05-27 14:27 Changing a varchar(7) domain into text directly in pg_type Richard Zetterberg <[email protected]>
2025-05-27 18:09 ` Re: Changing a varchar(7) domain into text directly in pg_type Adrian Klaver <[email protected]>
2025-05-28 09:53 ` Re: Changing a varchar(7) domain into text directly in pg_type Richard Zetterberg <[email protected]>
@ 2025-05-28 14:13 ` Ron Johnson <[email protected]>
0 siblings, 0 replies; 4+ messages in thread
From: Ron Johnson @ 2025-05-28 14:13 UTC (permalink / raw)
To: pgsql-generallists.postgresql.org <[email protected]>
Foreign key "code tables" are ubiquitous in good database design; another
example is zip/postal codes and cities. Statuses are another standard
usage. Thus, what you're doing is absolutely bog-standard. (In your case,
I would add "taxonomy_edition" to the species table. You get more
information, and don't have to update the species table; just add more to
it. New sightings can use the new taxonomy.)
As to whether it would work... I absolutely _would not_ go mucking around
in the catalog. Bite the bullet and ALTER the twenty-odd columns to TEXT.
(Why TEXT and not a DOMAIN? Simplicity.)
On Wed, May 28, 2025 at 8:33 AM Richard Zetterberg <
[email protected]> wrote:
> Thanks for the reply, Adrian!
>
> I must have explained my problem poorly, sorry for that.
>
> Let me present an imaginary scenario and hopefully that clears things up.
>
> It's 2015 and you want to start a birdwatching club and want to keep track
> of sightings in a postgres database.
>
> To make sure that users in your birdwatching club all enter the specie of
> the observed bird correctly, you decide to create a table that will contain
> all distinct bird species. Since there are over >10 000 bird species you
> decide to download an existing taxonomy of bird species. On
> birdtaxonomies.com you download edition 5 of the bird species taxonomy as
> a CSV-file. Each row of this file represents a bird specie and it has two
> columns: unique ID and name.
>
> You observe that the IDs in the downloaded file have at most 6 characters.
> To make room for additional bird species in coming editions of the
> taxonomy, you decide to use `varchar(7)` as the type of the ID column. To
> improve readability, you create a type alias:
>
> ```
> CREATE DOMAIN species_id AS varchar(7);
> ```
>
> Then you create the table that contains the bird species taxonomy:
>
> ```
> CREATE TABLE species (
> id species_id PRIMARY KEY,
> name text NOT NULL
> );
> ```
>
> Once all the bird species are imported from the taxonomy file into the
> `species` table, nothing is changed in this file.
>
> To guarantee that people enter the specie correctly, you now add a foreign
> key to all of your observation tables referencing the `species` table.
> Here's an example of what one of those tables look like:
>
> ```
> CREATE TABLE sightings (
> id serial PRIMARY KEY,
> specie_id species_id NOT NULL REFERENCES species (id),
> coordinates geography(POINT, 4326) NOT NULL,
> date timestamptz NOT NULL
> );
> ```
>
> Each year, birdtaxonomies.com release a new edition of the bird species
> taxonomy, that you update your species table with. But this year,
> birdtaxonomies.com comes decides they want a more fine grained way of
> classifying bird species, which results in latest edition of bird species
> taxonomy to have more than 1 million species and IDs that are as long as 9
> characters.
>
> After these 10 years your club has more than 1 million members worldwide,
> more than 20 different kinds of observations tables, procedures and views,
> billions of rows of different kinds of observations and a total size of
> over 100 GB.
>
> In order to import the latest taxonomy, you need to change the type of the
> domain `species_id` from `varchar(7)` to `text`. But postgres doesn't allow
> you to change the type of a domain with a single command. Instead what you
> need to do is:
>
> 1. Rename current domain: `ALTER DOMAIN species_id RENAME TO
> species_id_depricated;`
> 2. Create a new domain: `CREATE DOMAIN species_id AS text;`
> 3. Change all >20 tables, procedures and views to use the new domain
> 4. Drop the old domain
>
> You had done a similar but smaller change a couple of years ago, and that
> meant days of downtime. Also, you are a lazy person and it's tedious to
> write the SQL code to change all tables, procedures and views. So, you
> decide to manipulate `pg_type` directly, since you read somewhere online
> that `varchar` and `text` are basically the same under the hood.
>
> End of story.
>
> Basically, I was hoping that my shortcut would save time and that this
> small `pg_type` manipulation can be done without any side effects. I don't
> have any knowledge of postgres internals, so I don't know if postgres would
> accept this change and carry on like nothing happened. Or, if it will break
> my indices or how queries are executed.
>
> In my head, this change would be instant and postgres would carry on like
> nothing happened and that the domain always had the type `text`. Is this a
> fools errand? Or could it actually make sense in this situation?
>
> Thanks
> Richard
>
> On Tue, May 27, 2025 at 8:10 PM Adrian Klaver <[email protected]>
> wrote:
>
>>
>>
>> On 5/27/25 7:27 AM, Richard Zetterberg wrote:
>> > Hello,
>> >
>> > I have a read-only table that contains a set of never changing
>> > categories. Each category has a unique alpha numerical ID and a
>> > description. The purpose of this table is so that other tables can
>> > reference the ID of this table, to make sure that they don't contain
>> > invalid/unknown categories and so that users can lookup the description
>> > of each category.
>> Define 'read-only'. In other words can you temporarily make it not
>> read-only and change the type to text(or just varchar (no length
>> specifier)?
>> This would be the easiest fix.
>>
>> >
>> > This category table has the following type on the ID column:
>> > "varchar(7)" (yes, I should have used text). In order to avoid having
>> to
>> > type "varchar(7)" in all the tables that references the category table,
>> > I created this domain that I used as type for all referencing columns:
>> > "CREATE DOMAIN cat.id <http://cat.id; AS varchar(7);".
>> >
>> > During some data archeology, I found a bunch of new categories that
>> > haven't been imported into the database yet, and they have IDs longer
>> > than 7.
>>
>> If the read-only table field has a maximum length of 7 and you have
>> incoming data that is coming in longer then 7 characters, how are they
>> going to reference the read-only table?
>>
>> >
>> > I've seen claims that varchar and text have the same representation on
>> > disk and that they are treated the same way "under the hood", except
>> for
>> > the extra constraint checks on varchar. So, I thought that maybe I
>> could
>> > just change the type of my domain to text, directly in pg_type and that
>> > should solve my problems
>>
>> Per my comment above, how?
>>
>>
>> > Thanks for any insight,
>> > Richard Zetterberg
>>
>> --
>> Adrian Klaver
>> [email protected]
>>
>
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2025-05-28 14:13 UTC | newest]
Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-05-27 14:27 Changing a varchar(7) domain into text directly in pg_type Richard Zetterberg <[email protected]>
2025-05-27 18:09 ` Adrian Klaver <[email protected]>
2025-05-28 09:53 ` Richard Zetterberg <[email protected]>
2025-05-28 14:13 ` Ron Johnson <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox