public inbox for [email protected]  
help / color / mirror / Atom feed
From: Lyle <[email protected]>
To: [email protected]
Subject: Unclear CHARACTER specification
Date: Sun, 30 Dec 2012 15:02:31 +0000
Message-ID: <[email protected]> (raw)
List-Unsubscribe: <mailto:[email protected]?body=unsub%20pgsql-docs>

Hi,
   Here:
http://www.postgresql.org/docs/9.2/interactive/datatype-character.html

There isn't really clear guidance on ranges and limits. All we have to 
go on is this paragraph:

=start
The storage requirement for a short string (up to 126 bytes) is 1 byte 
plus the actual string, which includes the space padding in the case of 
character. Longer strings have 4 bytes of overhead instead of 1. Long 
strings are compressed by the system automatically, so the physical 
requirement on disk might be less. Very long values are also stored in 
background tables so that they do not interfere with rapid access to 
shorter column values. In any case, the longest possible character 
string that can be stored is about 1 GB. (The maximum value that will be 
allowed for n in the data type declaration is less than that. It 
wouldn't be useful to change this because with multibyte character 
encodings the number of characters and bytes can be quite different. If 
you desire to store long strings with no specific upper limit, use text 
or character varying without a length specifier, rather than making up 
an arbitrary length limit.)
=end

 From this I can deduce that if I want to keep the character string as a 
short string CHAR(126) should achieve that, for single byte character 
encodings at least.

Then it says about long and very long strings, but doesn't give any 
figures. Nor does it say whether it decides a string is long or very 
long before or after the compression. There must be byte length figures 
associated with this, please could we have them?

Why is the longest possible string about 1GB? Is that compressed or 
uncompressed? Why can't we have an exact figure? What compression is 
used? If the amount allowed for n in the data type definition is less, 
what is it?

I can't imagine the PostgreSQL code has something like
if ( length( string ) + RAND() > 1GB ) {
error();
}
Seeing how short, long, and very long strings have been mentioned, along 
with compression, why not complete this description with specific figures?

Is there other documentation available that covers this?

Sorry for all the questions, but I really thing all of this should be 
covered already. Even if it's platform specific, a table with platforms 
and figures would do the job nicely.


Lyle



-- 
Sent via pgsql-docs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs




view thread (2+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected]
  Subject: Re: Unclear CHARACTER specification
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox