public inbox for [email protected]  
help / color / mirror / Atom feed
Unclear CHARACTER specification
2+ messages / 2 participants
[nested] [flat]

* Unclear CHARACTER specification
@ 2012-12-30 15:02 Lyle <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Lyle @ 2012-12-30 15:02 UTC (permalink / raw)
  To: pgsql-docs

Hi,
   Here:
http://www.postgresql.org/docs/9.2/interactive/datatype-character.html

There isn't really clear guidance on ranges and limits. All we have to 
go on is this paragraph:

=start
The storage requirement for a short string (up to 126 bytes) is 1 byte 
plus the actual string, which includes the space padding in the case of 
character. Longer strings have 4 bytes of overhead instead of 1. Long 
strings are compressed by the system automatically, so the physical 
requirement on disk might be less. Very long values are also stored in 
background tables so that they do not interfere with rapid access to 
shorter column values. In any case, the longest possible character 
string that can be stored is about 1 GB. (The maximum value that will be 
allowed for n in the data type declaration is less than that. It 
wouldn't be useful to change this because with multibyte character 
encodings the number of characters and bytes can be quite different. If 
you desire to store long strings with no specific upper limit, use text 
or character varying without a length specifier, rather than making up 
an arbitrary length limit.)
=end

 From this I can deduce that if I want to keep the character string as a 
short string CHAR(126) should achieve that, for single byte character 
encodings at least.

Then it says about long and very long strings, but doesn't give any 
figures. Nor does it say whether it decides a string is long or very 
long before or after the compression. There must be byte length figures 
associated with this, please could we have them?

Why is the longest possible string about 1GB? Is that compressed or 
uncompressed? Why can't we have an exact figure? What compression is 
used? If the amount allowed for n in the data type definition is less, 
what is it?

I can't imagine the PostgreSQL code has something like
if ( length( string ) + RAND() > 1GB ) {
error();
}
Seeing how short, long, and very long strings have been mentioned, along 
with compression, why not complete this description with specific figures?

Is there other documentation available that covers this?

Sorry for all the questions, but I really thing all of this should be 
covered already. Even if it's platform specific, a table with platforms 
and figures would do the job nicely.


Lyle



-- 
Sent via pgsql-docs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs




^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Unclear CHARACTER specification
@ 2013-01-04 20:11 Kevin Grittner <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Kevin Grittner @ 2013-01-04 20:11 UTC (permalink / raw)
  To: Lyle <[email protected]>; pgsql-docs

Lyle wrote:

> Is there other documentation available that covers this?

You might want to read this page:

http://www.postgresql.org/docs/9.2/static/storage-toast.html

Note that there is seldom a good reason to use char(n) in
PostgreSQL for any value of n. As the page you cited mentions,
"While character(n) has performance advantages in some other
database systems, there is no such advantage in PostgreSQL; in fact
character(n) is usually the slowest of the three because of its
additional storage costs. In most situations text or character
varying should be used instead."

Don't try to micro-optimize by breaking what is logically a larger
string into 126 character pieces; that will defeat copression, take
more space to store, and add processing overhead. In other words,
such an attempt will almost certainly backfire.

-Kevin


-- 
Sent via pgsql-docs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs




^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2013-01-04 20:11 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2012-12-30 15:02 Unclear CHARACTER specification Lyle <[email protected]>
2013-01-04 20:11 Re: Unclear CHARACTER specification Kevin Grittner <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox