From: Bruce Momjian
Message-Id: <200501040005.j0405cG28522@candle.pha.pa.us>
Subject: Re: Doc patch needed: encodings?
In-Reply-To: <5412.1102360408@sss.pgh.pa.us>
To: Tom Lane
Date: Mon, 3 Jan 2005 19:05:38 -0500 (EST)
Cc: Peter Eisentraut, josh@agliodbs.com, PostgreSQL Docs
X-Mailer: ELM [version 2.4ME+ PL108 (25)]
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=ELM1104797138-1052-0_
Content-Transfer-Encoding: 7bit

--ELM1104797138-1052-0_
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII

I have applied the following patch to mention that non-C locales affect LIKE.

---------------------------------------------------------------------------

Tom Lane wrote:
> Peter Eisentraut writes:
> > Josh Berkus wrote:
> >> I'd like to have an explanation of this somewhere else newbies are
> >> liable to read it, *before* their first production "LIKE" query
> >> doesn't use an index. Where would be appropriate?
>
> > Near the documentation of "LIKE".
>
> I think it would be fair to mention this somewhere near the discussion
> of creating a database cluster, too. The existing documentation does
> warn you that sort order may be affected by your choice, but there is
> nothing anywhere near that section to suggest that LIKE performance
> might be affected. A para in the "Locale Support" section (in
> charset.sgml) would probably be appropriate, and maybe another word or
> two in the places that link to it in runtime.sgml and ref/initdb.sgml.
>
> regards, tom lane

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

--ELM1104797138-1052-0_
Content-Transfer-Encoding: 7bit
Content-Type: text/plain
Content-Disposition: inline; filename="/bjm/diff"

Index: doc/src/sgml/charset.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v
retrieving revision 2.47
diff -c -c -r2.47 charset.sgml
*** doc/src/sgml/charset.sgml 27 Dec 2004 22:30:10 -0000 2.47
--- doc/src/sgml/charset.sgml 4 Jan 2005 00:02:40 -0000
***************
*** 189,198 ****
  </sect2>

  <sect2>
! <title>Benefits</>

  <para>
! Locale support influences in particular the following features:

  <itemizedlist>
  <listitem>
--- 189,198 ----
  </sect2>

  <sect2>
! <title>Behavior</>

  <para>
! Locale support influences the following features:

  <itemizedlist>
  <listitem>
***************
*** 204,209 ****
--- 204,216 ----

  <listitem>
  <para>
+ The ability to use indexes with <literal>LIKE</> clauses
+ <indexterm><primary>LIKE</><secondary>and locales</></indexterm>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
  The <function>to_char</> family of functions
  </para>
  </listitem>
***************
*** 211,219 ****
  </para>

  <para>
! The only severe drawback of using the locale support in
! <productname>PostgreSQL</> is its speed. So use locales only if
! you actually need them.
  </para>
  </sect2>

--- 218,228 ----
  </para>

  <para>
! The drawback of using locales other than <literal>C</> or
! <literal>POSIX</> in <productname>PostgreSQL</> is its performance
! impact. It slows character handling and prevents ordinary indexes
! from being used by <literal>LIKE</>. For this reason use locales
! only if you actually need them.
  </para>
  </sect2>

Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.299
diff -c -c -r1.299 runtime.sgml
*** doc/src/sgml/runtime.sgml 26 Dec 2004 23:06:56 -0000 1.299
--- doc/src/sgml/runtime.sgml 4 Jan 2005 00:02:56 -0000
***************
*** 144,152 ****
  that can be found in <xref linkend="locale">. The sort order used
  within a particular database cluster is set by
  <command>initdb</command> and cannot be changed later, short of
! dumping all data, rerunning <command>initdb</command>, and
! reloading the data. So it's important to make this choice correctly
! the first time.
  </para>
  </sect1>

--- 144,153 ----
  that can be found in <xref linkend="locale">. The sort order used
  within a particular database cluster is set by
  <command>initdb</command> and cannot be changed later, short of
! dumping all data, rerunning <command>initdb</command>, and reloading
! the data. There is also a performance impact for using locales
! other than <literal>C</> or <literal>POSIX</>. Therefore, it is
! important to make this choice correctly the first time.
  </para>
  </sect1>

Index: doc/src/sgml/ref/initdb.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/initdb.sgml,v
retrieving revision 1.32
diff -c -c -r1.32 initdb.sgml
*** doc/src/sgml/ref/initdb.sgml 1 Aug 2004 06:19:18 -0000 1.32
--- doc/src/sgml/ref/initdb.sgml 4 Jan 2005 00:02:57 -0000
***************
*** 54,74 ****
  </para>

  <para>
! <command>initdb</command> initializes the database cluster's
! default locale and character set encoding. Some locale categories
! are fixed for the lifetime of the cluster, so it is important to
! make the right choice when running <command>initdb</command>.
! Other locale categories can be changed later when the server is
! started. <command>initdb</command> will write those locale
! settings into the <filename>postgresql.conf</filename>
! configuration file so they are the default, but they can be changed
! by editing that file. To set the locale that
! <command>initdb</command> uses, see the description of the
! <option>--locale</option> option. The character set encoding can
  be set separately for each database as it is created.
  <command>initdb</command> determines the encoding for the
  <literal>template1</literal> database, which will serve as the
! default for all other databases. To alter the default encoding use
  the <option>--encoding</option> option.
  </para>

--- 54,75 ----
  </para>

  <para>
! <command>initdb</command> initializes the database cluster's default
! locale and character set encoding. Some locale categories are fixed
! for the lifetime of the cluster. There is also a performance impact
! in using locales other than <literal>C</> or <literal>POSIX</>.
! Therefore it is important to make the right choice when running
! <command>initdb</command>. Other locale categories can be changed
! later when the server is started. <command>initdb</command> will
! write those locale settings into the
! <filename>postgresql.conf</filename> configuration file so they are
! the default, but they can be changed by editing that file. To set the
! locale that <command>initdb</command> uses, see the description of
! the <option>--locale</option> option. The character set encoding can
  be set separately for each database as it is created.
  <command>initdb</command> determines the encoding for the
  <literal>template1</literal> database, which will serve as the
! default for all other databases. To alter the default encoding use
  the <option>--encoding</option> option.
  </para>

--ELM1104797138-1052-0_--