public inbox for [email protected]  
help / color / mirror / Atom feed
Upcoming PG re-releases
55+ messages / 21 participants
[nested] [flat]

* Upcoming PG re-releases
@ 2005-11-30 15:56  Tom Lane <[email protected]>
  0 siblings, 3 replies; 55+ messages in thread

From: Tom Lane @ 2005-11-30 15:56 UTC (permalink / raw)
  To: [email protected]

It's been about a month since 8.1.0 was released, and we've found about
the usual number of bugs for a new release, so it seems like it's time
for 8.1.1.  The core committee has tentatively agreed to plan a release
for Tuesday Dec 6 (which means wrapping tarballs Monday).  We will
at the same time be making new dot-releases in the 7.3, 7.4, and 8.0
branches, principally to fix the SLRU race condition reported by Jim
Nasby and Robert Creager.

So ... if you've got any open issues with the back branches, now's the
time to get those patches in ...

			regards, tom lane



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-11-30 16:13  Andrew Dunstan <[email protected]>
  parent: Tom Lane <[email protected]>
  2 siblings, 1 reply; 55+ messages in thread

From: Andrew Dunstan @ 2005-11-30 16:13 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: [email protected]



Tom Lane wrote:

> We will
>at the same time be making new dot-releases in the 7.3, 7.4, and 8.0
>branches, principally to fix the SLRU race condition reported by Jim
>Nasby and Robert Creager.
>
>
>  
>

Was there a conclusion out of the recent discussion on EOL policy? The 
consensus seemed to be something like: "We will maintain releases to the 
best of our ability for at least 2 years plus 1 release cycle. After 
that, support may be dropped at any time when maintenance becomes 
difficult."

Have we actually officially stopped supporting the 7.2 series?

All this needs some announcement from the core trsam, IMNSHO - there has 
been some confusion over it (e.g. I saw someone recently saying we had 
stopped supporting the 7.3 series, which the above would seem to 
indicate is not true).

cheers

andrew



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-11-30 16:40  Tom Lane <[email protected]>
  parent: Andrew Dunstan <[email protected]>
  0 siblings, 2 replies; 55+ messages in thread

From: Tom Lane @ 2005-11-30 16:40 UTC (permalink / raw)
  To: Andrew Dunstan <[email protected]>; +Cc: [email protected]

Andrew Dunstan <[email protected]> writes:
> Have we actually officially stopped supporting the 7.2 series?

Yeah, we have.  It reached the "too difficult to support" point already
(the VACUUM/ctid bug back in August --- the patch used in the later
branches wouldn't apply at all, IIRC).

> All this needs some announcement from the core trsam, IMNSHO - there has 
> been some confusion over it (e.g. I saw someone recently saying we had 
> stopped supporting the 7.3 series, which the above would seem to 
> indicate is not true).

Personally I expect to keep supporting 7.3 for a long while, because Red
Hat pays me to ;-) ... and the EOL date for RHEL3 is a long way away yet.
The PG community may stop bothering with 7.3 releases before that.  But
I think Marc and Bruce figure "as long as the patches are in our CVS we
may as well put out a release".

We hashed all this out in the pghackers list back in August, but I agree
there ought to be something about it on the website.

			regards, tom lane



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 18:23  Robert Treat <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 2 replies; 55+ messages in thread

From: Robert Treat @ 2005-11-30 18:23 UTC (permalink / raw)
  To: [email protected]; pgsql-www; +Cc: Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

On Wednesday 30 November 2005 11:40, Tom Lane wrote:
> Personally I expect to keep supporting 7.3 for a long while, because Red
> Hat pays me to ;-) ... and the EOL date for RHEL3 is a long way away yet.
> The PG community may stop bothering with 7.3 releases before that.  But
> I think Marc and Bruce figure "as long as the patches are in our CVS we
> may as well put out a release".
>

Yeah, thats one of the reasons I am skeptical about having official policies 
on this type of thing.  If Sun decided they wanted to maintain 7.2 and were 
going to dedicate developers and testing for it, would we really turn that 
away?  OK, I don't really want to have this discussion again, but as of now I 
think we are all agreed that 7.2 is unsupported. 

> We hashed all this out in the pghackers list back in August, but I agree
> there ought to be something about it on the website.
>

We've been kicking it around but haven't moved much on this...

Marc, can you move the 7.2 branches in the FTP under the OLD directory?
http://www.postgresql.org/ftp/source/

We need to do the same with 7.2 documentation, moving them into the Manual 
Archive http://www.postgresql.org/docs/manuals/archive.html.  We can also 
change the caption on the main documentation page to note these are manuals 
for the current supported versions. 

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 18:30  Marc G. Fournier <[email protected]>
  parent: Robert Treat <[email protected]>
  1 sibling, 0 replies; 55+ messages in thread

From: Marc G. Fournier @ 2005-11-30 18:30 UTC (permalink / raw)
  To: Robert Treat <[email protected]>; +Cc: [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>


Done, as well as moved all but the last two of each version after ...


On Wed, 30 Nov 2005, Robert Treat wrote:

> On Wednesday 30 November 2005 11:40, Tom Lane wrote:
>> Personally I expect to keep supporting 7.3 for a long while, because Red
>> Hat pays me to ;-) ... and the EOL date for RHEL3 is a long way away yet.
>> The PG community may stop bothering with 7.3 releases before that.  But
>> I think Marc and Bruce figure "as long as the patches are in our CVS we
>> may as well put out a release".
>>
>
> Yeah, thats one of the reasons I am skeptical about having official policies
> on this type of thing.  If Sun decided they wanted to maintain 7.2 and were
> going to dedicate developers and testing for it, would we really turn that
> away?  OK, I don't really want to have this discussion again, but as of now I
> think we are all agreed that 7.2 is unsupported.
>
>> We hashed all this out in the pghackers list back in August, but I agree
>> there ought to be something about it on the website.
>>
>
> We've been kicking it around but haven't moved much on this...
>
> Marc, can you move the 7.2 branches in the FTP under the OLD directory?
> http://www.postgresql.org/ftp/source/
>
> We need to do the same with 7.2 documentation, moving them into the Manual
> Archive http://www.postgresql.org/docs/manuals/archive.html.  We can also
> change the caption on the main documentation page to note these are manuals
> for the current supported versions.
>
> -- 
> Robert Treat
> Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [email protected]           Yahoo!: yscrappy              ICQ: 7615664



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 18:33  Magnus Hagander <[email protected]>
  0 siblings, 2 replies; 55+ messages in thread

From: Magnus Hagander @ 2005-11-30 18:33 UTC (permalink / raw)
  To: Marc G. Fournier <[email protected]>; Robert Treat <[email protected]>; +Cc: [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

Someone suggested earlier that we should drop the binaries for
nonsupported versions completely from the ftp site. Thoughts on this?

If not, they should at least go into OLD as well. But personally, I'm
for dropping them completely. If you're on something that old (heck, we
have 7.0 binaries..), you can still build from source.

Speaking of which, any reason not to drop the 8.1 beta win32 binaries?

//Magnus
 

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Marc G. Fournier
> Sent: Wednesday, November 30, 2005 7:31 PM
> To: Robert Treat
> Cc: [email protected]; [email protected]; 
> Tom Lane; Andrew Dunstan
> Subject: Re: [pgsql-www] [HACKERS] Upcoming PG re-releases
> 
> 
> Done, as well as moved all but the last two of each version after ...
> 
> 
> On Wed, 30 Nov 2005, Robert Treat wrote:
> 
> > On Wednesday 30 November 2005 11:40, Tom Lane wrote:
> >> Personally I expect to keep supporting 7.3 for a long 
> while, because 
> >> Red Hat pays me to ;-) ... and the EOL date for RHEL3 is a 
> long way away yet.
> >> The PG community may stop bothering with 7.3 releases 
> before that.  
> >> But I think Marc and Bruce figure "as long as the patches 
> are in our 
> >> CVS we may as well put out a release".
> >>
> >
> > Yeah, thats one of the reasons I am skeptical about having official 
> > policies on this type of thing.  If Sun decided they wanted to 
> > maintain 7.2 and were going to dedicate developers and 
> testing for it, 
> > would we really turn that away?  OK, I don't really want to 
> have this 
> > discussion again, but as of now I think we are all agreed 
> that 7.2 is unsupported.
> >
> >> We hashed all this out in the pghackers list back in August, but I 
> >> agree there ought to be something about it on the website.
> >>
> >
> > We've been kicking it around but haven't moved much on this...
> >
> > Marc, can you move the 7.2 branches in the FTP under the 
> OLD directory?
> > http://www.postgresql.org/ftp/source/
> >
> > We need to do the same with 7.2 documentation, moving them into the 
> > Manual Archive 
> http://www.postgresql.org/docs/manuals/archive.html.  
> > We can also change the caption on the main documentation 
> page to note 
> > these are manuals for the current supported versions.
> >
> > --
> > Robert Treat
> > Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
> >
> > ---------------------------(end of 
> > broadcast)---------------------------
> > TIP 2: Don't 'kill -9' the postmaster
> >
> 
> ----
> Marc G. Fournier           Hub.Org Networking Services 
> (http://www.hub.org)
> Email: [email protected]           Yahoo!: yscrappy             
>  ICQ: 7615664
> 
> ---------------------------(end of 
> broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 18:39  Marc G. Fournier <[email protected]>
  parent: Magnus Hagander <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: Marc G. Fournier @ 2005-11-30 18:39 UTC (permalink / raw)
  To: Magnus Hagander <[email protected]>; +Cc: Marc G. Fournier <[email protected]>; Robert Treat <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>


'k, moved it all into OLD as well ... haven't removed anything until more 
opt in on this ... I do agree that if you really want that old, you can 
build from scratch, but I'm also not the one that went to the trouble of 
building the binaries :)


On Wed, 30 Nov 2005, Magnus Hagander wrote:

> Someone suggested earlier that we should drop the binaries for
> nonsupported versions completely from the ftp site. Thoughts on this?
>
> If not, they should at least go into OLD as well. But personally, I'm
> for dropping them completely. If you're on something that old (heck, we
> have 7.0 binaries..), you can still build from source.
>
> Speaking of which, any reason not to drop the 8.1 beta win32 binaries?
>
> //Magnus
>
>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Marc G. Fournier
>> Sent: Wednesday, November 30, 2005 7:31 PM
>> To: Robert Treat
>> Cc: [email protected]; [email protected];
>> Tom Lane; Andrew Dunstan
>> Subject: Re: [pgsql-www] [HACKERS] Upcoming PG re-releases
>>
>>
>> Done, as well as moved all but the last two of each version after ...
>>
>>
>> On Wed, 30 Nov 2005, Robert Treat wrote:
>>
>>> On Wednesday 30 November 2005 11:40, Tom Lane wrote:
>>>> Personally I expect to keep supporting 7.3 for a long
>> while, because
>>>> Red Hat pays me to ;-) ... and the EOL date for RHEL3 is a
>> long way away yet.
>>>> The PG community may stop bothering with 7.3 releases
>> before that.
>>>> But I think Marc and Bruce figure "as long as the patches
>> are in our
>>>> CVS we may as well put out a release".
>>>>
>>>
>>> Yeah, thats one of the reasons I am skeptical about having official
>>> policies on this type of thing.  If Sun decided they wanted to
>>> maintain 7.2 and were going to dedicate developers and
>> testing for it,
>>> would we really turn that away?  OK, I don't really want to
>> have this
>>> discussion again, but as of now I think we are all agreed
>> that 7.2 is unsupported.
>>>
>>>> We hashed all this out in the pghackers list back in August, but I
>>>> agree there ought to be something about it on the website.
>>>>
>>>
>>> We've been kicking it around but haven't moved much on this...
>>>
>>> Marc, can you move the 7.2 branches in the FTP under the
>> OLD directory?
>>> http://www.postgresql.org/ftp/source/
>>>
>>> We need to do the same with 7.2 documentation, moving them into the
>>> Manual Archive
>> http://www.postgresql.org/docs/manuals/archive.html.
>>> We can also change the caption on the main documentation
>> page to note
>>> these are manuals for the current supported versions.
>>>
>>> --
>>> Robert Treat
>>> Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
>>>
>>> ---------------------------(end of
>>> broadcast)---------------------------
>>> TIP 2: Don't 'kill -9' the postmaster
>>>
>>
>> ----
>> Marc G. Fournier           Hub.Org Networking Services
>> (http://www.hub.org)
>> Email: [email protected]           Yahoo!: yscrappy
>>  ICQ: 7615664
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 9: In versions below 8.0, the planner will ignore your desire to
>>        choose an index scan if your joining column's datatypes do not
>>        match
>>
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [email protected]           Yahoo!: yscrappy              ICQ: 7615664



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 18:59  Robert Treat <[email protected]>
  parent: Magnus Hagander <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: Robert Treat @ 2005-11-30 18:59 UTC (permalink / raw)
  To: Magnus Hagander <[email protected]>; +Cc: Marc G. Fournier <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

On Wed, 2005-11-30 at 13:33, Magnus Hagander wrote:
> Someone suggested earlier that we should drop the binaries for
> nonsupported versions completely from the ftp site. Thoughts on this?
> 
> If not, they should at least go into OLD as well. But personally, I'm
> for dropping them completely. If you're on something that old (heck, we
> have 7.0 binaries..), you can still build from source.
> 

I'm against the idea... the cost for us is minimal, and the hassle
involved in building from source is quite large. 


Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 22:31  Robert Bernier <[email protected]>
  parent: Marc G. Fournier <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Robert Bernier @ 2005-11-30 22:31 UTC (permalink / raw)
  To: pgsql-www

On Wednesday 30 November 2005 13:39, Marc G. Fournier wrote:
> On Wed, 30 Nov 2005, Magnus Hagander wrote:
> > Someone suggested earlier that we should drop the binaries for
> > nonsupported versions completely from the ftp site. Thoughts on this?

I'm for keeping them in some sort of archive for historical reasons. My feeling is that somewhere down the road this will be a big deal.

Robert Bernier



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 22:39  Tom Lane <[email protected]>
  parent: Joshua D. Drake <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Tom Lane @ 2005-11-30 22:39 UTC (permalink / raw)
  To: [email protected]; +Cc: Robert Bernier <[email protected]>; pgsql-www

"Joshua D. Drake" <[email protected]> writes:
> On Wed, 2005-11-30 at 17:31 -0500, Robert Bernier wrote:
>> I'm for keeping them in some sort of archive for historical reasons. My feeling is that somewhere down the road this will be a big deal.

> We always have the CVS repo, so if we remove them... not big deal.

It's not necessarily that easy to rebuild old releases --- for instance,
modern versions of bison will spit up on our older grammar files, due to
carelessness about semicolons; and newer C compilers may complain about
things that older ones let pass, too.

Unless we're feeling short of disk space on the server, I'm for leaving
them there somewhere.  But definitely mark them old and not-recommended.

			regards, tom lane



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-11-30 22:41  Joshua D. Drake <[email protected]>
  parent: Robert Bernier <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Joshua D. Drake @ 2005-11-30 22:41 UTC (permalink / raw)
  To: Robert Bernier <[email protected]>; +Cc: pgsql-www

On Wed, 2005-11-30 at 17:31 -0500, Robert Bernier wrote:
> On Wednesday 30 November 2005 13:39, Marc G. Fournier wrote:
> > On Wed, 30 Nov 2005, Magnus Hagander wrote:
> > > Someone suggested earlier that we should drop the binaries for
> > > nonsupported versions completely from the ftp site. Thoughts on this?
> 
> I'm for keeping them in some sort of archive for historical reasons. My feeling is that somewhere down the road this will be a big deal.
> 

We always have the CVS repo, so if we remove them... not big deal.

> Robert Bernier
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
-- 
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: PLphp, PLperl, ODBCng - http://www.commandprompt.com/





^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-01 01:21  Christopher Kings-Lynne <[email protected]>
  parent: Tom Lane <[email protected]>
  2 siblings, 0 replies; 55+ messages in thread

From: Christopher Kings-Lynne @ 2005-12-01 01:21 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: [email protected]

I have a COPY CSV weird thing I'll post in a minute...

Tom Lane wrote:
> It's been about a month since 8.1.0 was released, and we've found about
> the usual number of bugs for a new release, so it seems like it's time
> for 8.1.1.  The core committee has tentatively agreed to plan a release
> for Tuesday Dec 6 (which means wrapping tarballs Monday).  We will
> at the same time be making new dot-releases in the 7.3, 7.4, and 8.0
> branches, principally to fix the SLRU race condition reported by Jim
> Nasby and Robert Creager.
> 
> So ... if you've got any open issues with the back branches, now's the
> time to get those patches in ...
> 
> 			regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 01:36  David Fetter <[email protected]>
  parent: Robert Treat <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: David Fetter @ 2005-12-01 01:36 UTC (permalink / raw)
  To: Robert Treat <[email protected]>; +Cc: [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

On Wed, Nov 30, 2005 at 01:23:38PM -0500, Robert Treat wrote:
> On Wednesday 30 November 2005 11:40, Tom Lane wrote:
> > Personally I expect to keep supporting 7.3 for a long while,
> > because Red Hat pays me to ;-) ... and the EOL date for RHEL3 is a
> > long way away yet.  The PG community may stop bothering with 7.3
> > releases before that.  But I think Marc and Bruce figure "as long
> > as the patches are in our CVS we may as well put out a release".
> 
> Yeah, thats one of the reasons I am skeptical about having official
> policies on this type of thing.

I see this as an excellent reason to draw a bright, sharp line between
what vendors support and what the community as a whole does,
especially where individual community members wear another hat.

> If Sun decided they wanted to maintain 7.2 and were going to
> dedicate developers and testing for it, would we really turn that
> away?

If any company chooses to support versions that the community is no
longer supporting, that can be part of their value-add or more
properly, their headache.  Making commitments on behalf of the
community--which will be held responsible for them no matter what
happens--based on what some company says it's going to do this week is
*extremely* ill-advised.

> OK, I don't really want to have this discussion again, but as of now
> I think we are all agreed that 7.2 is unsupported. 

And it's good that we're making more definite moves to show that we no
longer support it :)

> > We hashed all this out in the pghackers list back in August, but I agree
> > there ought to be something about it on the website.
> >
> 
> We've been kicking it around but haven't moved much on this...
> 
> Marc, can you move the 7.2 branches in the FTP under the OLD directory?
> http://www.postgresql.org/ftp/source/
> 
> We need to do the same with 7.2 documentation, moving them into the Manual 
> Archive http://www.postgresql.org/docs/manuals/archive.html.  We can also 
> change the caption on the main documentation page to note these are manuals 
> for the current supported versions. 

Excellent :)

Cheers,
D
-- 
David Fetter [email protected] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-01 02:21  Andrew Dunstan <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 0 replies; 55+ messages in thread

From: Andrew Dunstan @ 2005-12-01 02:21 UTC (permalink / raw)
  To: [email protected]; +Cc: [email protected]; [email protected]

Tom Lane said:

>
> We hashed all this out in the pghackers list back in August, but I
> agree there ought to be something about it on the website.
>

The reason I asked again is that, notwithstanding the recent discussion, I
have observed confusion about the matter (including Jan telling me he didn't
think there was any agreed policy).

cheers

andrew





^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 03:56  Marc G. Fournier <[email protected]>
  parent: David Fetter <[email protected]>
  0 siblings, 2 replies; 55+ messages in thread

From: Marc G. Fournier @ 2005-12-01 03:56 UTC (permalink / raw)
  To: David Fetter <[email protected]>; +Cc: Robert Treat <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

On Wed, 30 Nov 2005, David Fetter wrote:

> On Wed, Nov 30, 2005 at 01:23:38PM -0500, Robert Treat wrote:
>> On Wednesday 30 November 2005 11:40, Tom Lane wrote:
>>> Personally I expect to keep supporting 7.3 for a long while,
>>> because Red Hat pays me to ;-) ... and the EOL date for RHEL3 is a
>>> long way away yet.  The PG community may stop bothering with 7.3
>>> releases before that.  But I think Marc and Bruce figure "as long
>>> as the patches are in our CVS we may as well put out a release".
>>
>> Yeah, thats one of the reasons I am skeptical about having official
>> policies on this type of thing.
>
> I see this as an excellent reason to draw a bright, sharp line between
> what vendors support and what the community as a whole does,
> especially where individual community members wear another hat.

So, if Sun, SRA, Pervasive, Command Prompt, etc were to submit a patch for 
v7.2, we'd refuse it?  I think not ...

Will we accept/fix a bug report *for* v7.2, that is different ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [email protected]           Yahoo!: yscrappy              ICQ: 7615664



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 04:22  Joshua D. Drake <[email protected]>
  parent: Marc G. Fournier <[email protected]>
  1 sibling, 0 replies; 55+ messages in thread

From: Joshua D. Drake @ 2005-12-01 04:22 UTC (permalink / raw)
  To: Marc G. Fournier <[email protected]>; +Cc: David Fetter <[email protected]>; Robert Treat <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>


>>
>> I see this as an excellent reason to draw a bright, sharp line between
>> what vendors support and what the community as a whole does,
>> especially where individual community members wear another hat.
>
>
> So, if Sun, SRA, Pervasive, Command Prompt, etc were to submit a patch 
> for v7.2, we'd refuse it?  I think not ...

Oh but you should. The community has enough to worry about.

Joshua D. Drake


-- 
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: PLphp, PLperl - http://www.commandprompt.com/




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 05:15  David Fetter <[email protected]>
  parent: Marc G. Fournier <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: David Fetter @ 2005-12-01 05:15 UTC (permalink / raw)
  To: Marc G. Fournier <[email protected]>; +Cc: Robert Treat <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

On Wed, Nov 30, 2005 at 11:56:33PM -0400, Marc G. Fournier wrote:
> On Wed, 30 Nov 2005, David Fetter wrote:
> 
> >On Wed, Nov 30, 2005 at 01:23:38PM -0500, Robert Treat wrote:
> >>On Wednesday 30 November 2005 11:40, Tom Lane wrote:
> >>>Personally I expect to keep supporting 7.3 for a long while,
> >>>because Red Hat pays me to ;-) ... and the EOL date for RHEL3 is a
> >>>long way away yet.  The PG community may stop bothering with 7.3
> >>>releases before that.  But I think Marc and Bruce figure "as long
> >>>as the patches are in our CVS we may as well put out a release".
> >>
> >>Yeah, thats one of the reasons I am skeptical about having official
> >>policies on this type of thing.
> >
> >I see this as an excellent reason to draw a bright, sharp line between
> >what vendors support and what the community as a whole does,
> >especially where individual community members wear another hat.
> 
> So, if Sun, SRA, Pervasive, Command Prompt, etc were to submit a patch for 
> v7.2, we'd refuse it?

That depends on what you mean by "refuse."  Such a patch wouldn't
resurrect the original Postgres with POSTQUEL and cause us to support
it, and it won't cause us to start supporting PostgreSQL 7.2 again
either.

That said, there's a backports project on pgfoundry.  We could see
about something like an "attic" project for such patches, etc.  This
way, the community doesn't get albatrosses draped over its neck, and
the patches are available for those interested :)

Cheers,
D
-- 
David Fetter [email protected] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 10:06  Richard Huxton <[email protected]>
  parent: Robert Treat <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Richard Huxton @ 2005-12-01 10:06 UTC (permalink / raw)
  To: Robert Treat <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Marc G. Fournier <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

Robert Treat wrote:
> On Wed, 2005-11-30 at 13:33, Magnus Hagander wrote:
>> Someone suggested earlier that we should drop the binaries for
>> nonsupported versions completely from the ftp site. Thoughts on this?
>>
>> If not, they should at least go into OLD as well. But personally, I'm
>> for dropping them completely. If you're on something that old (heck, we
>> have 7.0 binaries..), you can still build from source.
>>
> 
> I'm against the idea... the cost for us is minimal, and the hassle
> involved in building from source is quite large. 

I don't have a need for an old PG binary. But when I have needed really 
old binaries it's always been in the middle of the night, in front of a 
machine with a teletype terminal, in the dark, surrounded by wolves 
while a timer ticks into the red... Locating the right versions of 17 
different libraries and compiling from source is always my second choice.

If it's practical to keep them, I'd like to suggest doing so. If it's 
not practical, could we have a where_to_find_old_versions.txt file and 
open a project on sourceforge to keep them?

-- 
   Richard Huxton
   Archonet Ltd



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 10:35  Euler Taveira de Oliveira <[email protected]>
  parent: Richard Huxton <[email protected]>
  0 siblings, 2 replies; 55+ messages in thread

From: Euler Taveira de Oliveira @ 2005-12-01 10:35 UTC (permalink / raw)
  To: Richard Huxton <[email protected]>; Robert Treat <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Marc G. Fournier <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

--- Richard Huxton <[email protected]> escreveu:

> If it's practical to keep them, I'd like to suggest doing so. If it's
> not practical, could we have a where_to_find_old_versions.txt file
> and 
> open a project on sourceforge to keep them?
> 
What about an museum.postgresql.org to keep the old releases?


Euler Taveira de Oliveira
euler[at]yahoo_com_br


	



	
		
_______________________________________________________ 
Yahoo! doce lar. Faça do Yahoo! sua homepage. 
http://br.yahoo.com/homepageset.html 




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 12:09  Peter Eisentraut <[email protected]>
  parent: Euler Taveira de Oliveira <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: Peter Eisentraut @ 2005-12-01 12:09 UTC (permalink / raw)
  To: [email protected]; +Cc: Euler Taveira de Oliveira <[email protected]>; Richard Huxton <[email protected]>; Robert Treat <[email protected]>; Magnus Hagander <[email protected]>; Marc G. Fournier <[email protected]>; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

Am Donnerstag, 1. Dezember 2005 11:35 schrieb Euler Taveira de Oliveira:
> What about an museum.postgresql.org to keep the old releases?

That gave me a good laugh, but there is something to be said about moving all 
no longer supported releases (according to the criteria that are being 
discussed) to an unmirrored site, say, archive.postgresql.org.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [pgsql-www] Upcoming PG re-releases
@ 2005-12-01 12:20  Csaba Nagy <[email protected]>
  parent: Euler Taveira de Oliveira <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: Csaba Nagy @ 2005-12-01 12:20 UTC (permalink / raw)
  To: Euler Taveira de Oliveira <[email protected]>; +Cc: Richard Huxton <[email protected]>; Robert Treat <[email protected]>; Magnus Hagander <[email protected]>; Marc G. Fournier <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

Maybe "mausoleum" would be even better name :-D

Cheers,
Csaba.

On Thu, 2005-12-01 at 11:35, Euler Taveira de Oliveira wrote:
> --- Richard Huxton <[email protected]> escreveu:
> 
> > If it's practical to keep them, I'd like to suggest doing so. If it's
> > not practical, could we have a where_to_find_old_versions.txt file
> > and 
> > open a project on sourceforge to keep them?
> > 
> What about an museum.postgresql.org to keep the old releases?
> 
> 
> Euler Taveira de Oliveira
> euler[at]yahoo_com_br
> 
> 
> 	
> 
> 
> 
> 	
> 		
> _______________________________________________________ 
> Yahoo! doce lar. Faça do Yahoo! sua homepage. 
> http://br.yahoo.com/homepageset.html
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to [email protected] so that your
>        message can get through to the mailing list cleanly




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [pgsql-www] Upcoming PG re-releases
@ 2005-12-01 12:37  Richard Huxton <[email protected]>
  parent: Csaba Nagy <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Richard Huxton @ 2005-12-01 12:37 UTC (permalink / raw)
  To: Csaba Nagy <[email protected]>; +Cc: Euler Taveira de Oliveira <[email protected]>; Robert Treat <[email protected]>; Magnus Hagander <[email protected]>; Marc G. Fournier <[email protected]>; [email protected]; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

Csaba Nagy wrote:
> Maybe "mausoleum" would be even better name :-D

Come on people, it's clearly: elephants-graveyard.postgresl.org

-- 
   Richard Huxton
   Archonet Ltd



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 17:00  Marc G. Fournier <[email protected]>
  parent: Peter Eisentraut <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Marc G. Fournier @ 2005-12-01 17:00 UTC (permalink / raw)
  To: Peter Eisentraut <[email protected]>; +Cc: [email protected]; Euler Taveira de Oliveira <[email protected]>; Richard Huxton <[email protected]>; Robert Treat <[email protected]>; Magnus Hagander <[email protected]>; Marc G. Fournier <[email protected]>; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

On Thu, 1 Dec 2005, Peter Eisentraut wrote:

> Am Donnerstag, 1. Dezember 2005 11:35 schrieb Euler Taveira de Oliveira:
>> What about an museum.postgresql.org to keep the old releases?
>
> That gave me a good laugh, but there is something to be said about moving all
> no longer supported releases (according to the criteria that are being
> discussed) to an unmirrored site, say, archive.postgresql.org.

That would be fairly trivial ... let me add it to the 'todo list' ... I 
take it that it would be safe to relegate the /pub/source/OLD stuff there 
too?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [email protected]           Yahoo!: yscrappy              ICQ: 7615664




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 17:10  Dave Page <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Dave Page @ 2005-12-01 17:10 UTC (permalink / raw)
  To: Marc G. Fournier <[email protected]>; Peter Eisentraut <[email protected]>; +Cc: [email protected]; Euler Taveira de Oliveira <[email protected]>; Richard Huxton <[email protected]>; Robert Treat <[email protected]>; Magnus Hagander <[email protected]>; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

 

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Marc G. Fournier
> Sent: 01 December 2005 17:01
> To: Peter Eisentraut
> Cc: [email protected]; Euler Taveira de Oliveira; 
> Richard Huxton; Robert Treat; Magnus Hagander; Marc G. 
> Fournier; [email protected]; Tom Lane; Andrew Dunstan
> Subject: Re: [pgsql-www] [HACKERS] Upcoming PG re-releases
> 
> On Thu, 1 Dec 2005, Peter Eisentraut wrote:
> 
> > Am Donnerstag, 1. Dezember 2005 11:35 schrieb Euler Taveira 
> de Oliveira:
> >> What about an museum.postgresql.org to keep the old releases?
> >
> > That gave me a good laugh, but there is something to be 
> said about moving all
> > no longer supported releases (according to the criteria 
> that are being
> > discussed) to an unmirrored site, say, archive.postgresql.org.
> 
> That would be fairly trivial ... let me add it to the 'todo 
> list' ... I 
> take it that it would be safe to relegate the /pub/source/OLD 
> stuff there 
> too?

Not so trivial to put behind a web interface or the download tracker
though. Is it really necessary to have a separate archive downloads
site? It's not like the old ones get in the way, or cost anything other
than disk space on the mirrors to store (and I've only ever heard mirror
admins say how small our site is compared to many others!).

Plus of course, weren't we trying to reduce the number of VMs/sites?

Regards, Dave.



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 17:21  Andrew Dunstan <[email protected]>
  parent: Dave Page <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Andrew Dunstan @ 2005-12-01 17:21 UTC (permalink / raw)
  To: Dave Page <[email protected]>; +Cc: Marc G. Fournier <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]; Euler Taveira de Oliveira <[email protected]>; Richard Huxton <[email protected]>; Robert Treat <[email protected]>; Magnus Hagander <[email protected]>; pgsql-www; Tom Lane <[email protected]>



Dave Page wrote:

> 
>
>  
>
>>-----Original Message-----
>>From: [email protected] 
>>[mailto:[email protected]] On Behalf Of Marc G. Fournier
>>Sent: 01 December 2005 17:01
>>To: Peter Eisentraut
>>Cc: [email protected]; Euler Taveira de Oliveira; 
>>Richard Huxton; Robert Treat; Magnus Hagander; Marc G. 
>>Fournier; [email protected]; Tom Lane; Andrew Dunstan
>>Subject: Re: [pgsql-www] [HACKERS] Upcoming PG re-releases
>>
>>On Thu, 1 Dec 2005, Peter Eisentraut wrote:
>>
>>    
>>
>>>Am Donnerstag, 1. Dezember 2005 11:35 schrieb Euler Taveira 
>>>      
>>>
>>de Oliveira:
>>    
>>
>>>>What about an museum.postgresql.org to keep the old releases?
>>>>        
>>>>
>>>That gave me a good laugh, but there is something to be 
>>>      
>>>
>>said about moving all
>>    
>>
>>>no longer supported releases (according to the criteria 
>>>      
>>>
>>that are being
>>    
>>
>>>discussed) to an unmirrored site, say, archive.postgresql.org.
>>>      
>>>
>>That would be fairly trivial ... let me add it to the 'todo 
>>list' ... I 
>>take it that it would be safe to relegate the /pub/source/OLD 
>>stuff there 
>>too?
>>    
>>
>
>Not so trivial to put behind a web interface or the download tracker
>though. Is it really necessary to have a separate archive downloads
>site? It's not like the old ones get in the way, or cost anything other
>than disk space on the mirrors to store (and I've only ever heard mirror
>admins say how small our site is compared to many others!).
>
>Plus of course, weren't we trying to reduce the number of VMs/sites?
>
>
>  
>

Agreed. I see no virtue in this at all. If we continue to make stuff 
available it must be because someone will need it. I can see that 
happening if some catastrophe happens on an old system, in which case 
the person hunting is likely to need to find it easily and possibly fast.

The network traffic involved in mirroring something that doesn't change 
is usually trivial, and disk space seems to be at most a few $ per Gb 
these days, so surely this is not a resource issue.

cheers

andrew




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [HACKERS] Upcoming PG re-releases
@ 2005-12-01 18:26  Magnus Hagander <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Magnus Hagander @ 2005-12-01 18:26 UTC (permalink / raw)
  To: Dave Page <[email protected]>; Marc G. Fournier <[email protected]>; Peter Eisentraut <[email protected]>; +Cc: [email protected]; Euler Taveira de Oliveira <[email protected]>; Richard Huxton <[email protected]>; Robert Treat <[email protected]>; pgsql-www; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>

> > That would be fairly trivial ... let me add it to the 'todo 
> list' ... 
> > I take it that it would be safe to relegate the 
> /pub/source/OLD stuff 
> > there too?
> 
> Not so trivial to put behind a web interface or the download 
> tracker though. Is it really necessary to have a separate 
> archive downloads site? It's not like the old ones get in the 
> way, or cost anything other than disk space on the mirrors to 
> store (and I've only ever heard mirror admins say how small 
> our site is compared to many others!).
> 
> Plus of course, weren't we trying to reduce the number of VMs/sites?

Agreed. If we're going to keep it, just sticking it in a /old/ directory
is definitly a lot better.

//Magnus




^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-03 15:54  Bruce Momjian <[email protected]>
  parent: Neil Conway <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-03 15:54 UTC (permalink / raw)
  To: Neil Conway <[email protected]>; +Cc: Tom Lane <[email protected]>; [email protected]

Neil Conway wrote:
> On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > It's been about a month since 8.1.0 was released, and we've found about
> > the usual number of bugs for a new release, so it seems like it's time
> > for 8.1.1.
> 
> I think one fix that should be made in time for 8.1.1 is adding a note
> to the "version migration" section of the 8.1 release notes describing
> the "invalid UTF-8 byte sequence" problems that some people have run
> into when upgrading from prior versions. I'm not familiar enough with
> the problem or its remedies to add the note myself, though.

Agreed, but I don't understand the problem well enough either.  Does
anyone?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-03 15:56  Neil Conway <[email protected]>
  parent: Tom Lane <[email protected]>
  2 siblings, 1 reply; 55+ messages in thread

From: Neil Conway @ 2005-12-03 15:56 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: [email protected]

On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> It's been about a month since 8.1.0 was released, and we've found about
> the usual number of bugs for a new release, so it seems like it's time
> for 8.1.1.

I think one fix that should be made in time for 8.1.1 is adding a note
to the "version migration" section of the 8.1 release notes describing
the "invalid UTF-8 byte sequence" problems that some people have run
into when upgrading from prior versions. I'm not familiar enough with
the problem or its remedies to add the note myself, though.

-Neil





^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: [pgsql-www] Upcoming PG re-releases
@ 2005-12-03 16:10  Kevin Brown <[email protected]>
  parent: David Fetter <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Kevin Brown @ 2005-12-03 16:10 UTC (permalink / raw)
  To: [email protected]

David Fetter wrote:
> On Wed, Nov 30, 2005 at 11:56:33PM -0400, Marc G. Fournier wrote:
> > So, if Sun, SRA, Pervasive, Command Prompt, etc were to submit a patch for 
> > v7.2, we'd refuse it?
> 
> That depends on what you mean by "refuse."  Such a patch wouldn't
> resurrect the original Postgres with POSTQUEL and cause us to support
> it, and it won't cause us to start supporting PostgreSQL 7.2 again
> either.

Okay, but suppose the patch in question breaks the version in question
in some subtle but horrible way?  If the community isn't "supporting"
the release in question then it implies that it won't go to the effort
of testing the patch, subjecting it to a beta period, etc.  But since
the patch would be applied by the community, the implication would be
that the community *endorses* the patch in question, since the
official source would be changed to reflect it.  If the patch breaks
the release horribly, just blindly accepting it wouldn't do good
things to the community's reputation.

And that means that the only really good way to guard against such an
occurrance is to subject the patch to the same process that's used for
officially supported releases.  At that point, there's no real
distinction between "officially supported" and "not officially
supported".  I doubt the community wants to go down that road.


The acceptance of a patch by the community probably implies a lot more
than one would think at first glance, so this is certainly an issue
that should be thought all the way through.



-- 
Kevin Brown					      [email protected]



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-04 16:25  Paul Lindner <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 2 replies; 55+ messages in thread

From: Paul Lindner @ 2005-12-04 16:25 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Neil Conway <[email protected]>; Tom Lane <[email protected]>; [email protected]

On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> Neil Conway wrote:
> > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > It's been about a month since 8.1.0 was released, and we've found about
> > > the usual number of bugs for a new release, so it seems like it's time
> > > for 8.1.1.
> > 
> > I think one fix that should be made in time for 8.1.1 is adding a note
> > to the "version migration" section of the 8.1 release notes describing
> > the "invalid UTF-8 byte sequence" problems that some people have run
> > into when upgrading from prior versions. I'm not familiar enough with
> > the problem or its remedies to add the note myself, though.
> 
> Agreed, but I don't understand the problem well enough either.  Does
> anyone?

There was a thread a couple of weeks back about this problem.  Here's
my sample writeup -- I give my permission for anyone to use it as they
see fit:


Upgrading UNICODE databases to 8.1

Postgres 8.1 includes a number of bug-fixes and improvements to
Unicode and UTF-8 character handling.  Unfortunately previous releases
would accept character sequences that were not valid UTF-8.  This
may cause problems when upgrading your database using
pg_dump/pg_restore resulting in an error message like this:

  Invalid UNICODE byte sequence detected near byte ...

To convert your pre-8.1 database to 8.1 you may have to remove and/or
fix the offending characters.  One simple way to fix the problem is to
run your pg_dump output through the iconv command like this:

  iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql

The -c flag tells iconv to omit invalid characters from output.

There is one problem with this.  Most versions of iconv try to read
the entire input file into memory.  If you dump is quite large you
will need to split the dump into multiple files and convert each one
individually.  You must use the -l flag for split to insure that the
unicode byte sequences are not split.

   split -l 10000 dump.sql

Another possible solution is to use the --inserts flag to pg_dump.
When you load the resulting data dump in 8.1 this will result in the
problem rows showing up in your error log.

-- 
Paul Lindner        ||||| | | | |  |  |  |   |   |
[email protected]


^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-04 16:34  Tom Lane <[email protected]>
  parent: Paul Lindner <[email protected]>
  1 sibling, 2 replies; 55+ messages in thread

From: Tom Lane @ 2005-12-04 16:34 UTC (permalink / raw)
  To: Paul Lindner <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Neil Conway <[email protected]>; [email protected]

Paul Lindner <[email protected]> writes:
> To convert your pre-8.1 database to 8.1 you may have to remove and/or
> fix the offending characters.  One simple way to fix the problem is to
> run your pg_dump output through the iconv command like this:

>   iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql

Is that really a one-size-fits-all solution?  Especially with -c?

			regards, tom lane



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-04 16:40  Paul Lindner <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: Paul Lindner @ 2005-12-04 16:40 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Paul Lindner <[email protected]>; Bruce Momjian <[email protected]>; Neil Conway <[email protected]>; [email protected]

On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> Paul Lindner <[email protected]> writes:
> > To convert your pre-8.1 database to 8.1 you may have to remove and/or
> > fix the offending characters.  One simple way to fix the problem is to
> > run your pg_dump output through the iconv command like this:
> 
> >   iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> 
> Is that really a one-size-fits-all solution?  Especially with -c?
> 

I'd say yes, and the -c flag is needed so iconv strips out the
invalid characters.  

This technique worked for some smaller databases I converted and
croaked with out-of-memory on the larger ones.

It certainly doesn't make the problem worse.

If one wanted to fix this in the general case one could duplicate the
iconv behavior in the Postgres code via some kind of special
flag/setting that is only used for imports..

  set strip_bad_utf8 = on


-- 
Paul Lindner        ||||| | | | |  |  |  |   |   |
[email protected]


^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-04 16:52  Tom Lane <[email protected]>
  parent: Paul Lindner <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Tom Lane @ 2005-12-04 16:52 UTC (permalink / raw)
  To: Paul Lindner <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Neil Conway <[email protected]>; [email protected]

Paul Lindner <[email protected]> writes:
> On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
>> Paul Lindner <[email protected]> writes:
>>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
>>
>> Is that really a one-size-fits-all solution?  Especially with -c?

> I'd say yes, and the -c flag is needed so iconv strips out the
> invalid characters.

That's exactly what's bothering me about it.  If we recommend that
we had better put a large THIS WILL DESTROY YOUR DATA warning first.
The problem is that the data is not "invalid" from the user's point
of view --- more likely, it's in some non-UTF8 encoding --- and so
just throwing away some of the characters is unlikely to make people
happy.

			regards, tom lane



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Upcoming PG re-releases
@ 2005-12-04 17:19  Gregory Maxwell <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Gregory Maxwell @ 2005-12-04 17:19 UTC (permalink / raw)
  To: [email protected]

On 12/4/05, Tom Lane <[email protected]> wrote:
> Paul Lindner <[email protected]> writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner <[email protected]> writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution?  Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it.  If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.

Nor is it even guarenteed to make the data load: If the column is
unique constrained and the removal of the non-UTF characters makes two
rows have the same data where they didn't before...

The way to preserve the data is to switch the column to be a bytea.



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-04 18:55  Martijn van Oosterhout <[email protected]>
  parent: Gregory Maxwell <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Martijn van Oosterhout @ 2005-12-04 18:55 UTC (permalink / raw)
  To: Gregory Maxwell <[email protected]>; +Cc: [email protected]

On Sun, Dec 04, 2005 at 12:19:32PM -0500, Gregory Maxwell wrote:
> > That's exactly what's bothering me about it.  If we recommend that
> > we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> > The problem is that the data is not "invalid" from the user's point
> > of view --- more likely, it's in some non-UTF8 encoding --- and so
> > just throwing away some of the characters is unlikely to make people
> > happy.
> 
> Nor is it even guarenteed to make the data load: If the column is
> unique constrained and the removal of the non-UTF characters makes two
> rows have the same data where they didn't before...
> 
> The way to preserve the data is to switch the column to be a bytea.

Additionally, it's hard to suggest anything better without specific
knowledge of the characters that are incorrect and how they got there.

The ideal solution would be a way for people to identify problem data
*before* they dump so they have an opportunity to fix it. Something
like a module they can load and say:

select val from table where not utf8_validate(val);

This would allow people to examine the data while the system is still
running and fix it. Maybe we can code something up in plpgsql? Slow as
molasses but you'll be able to run it anywhere.

Have a nice day,
-- 
Martijn van Oosterhout   <[email protected]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-05 04:43  Gavin Sherry <[email protected]>
  parent: Tom Lane <[email protected]>
  1 sibling, 0 replies; 55+ messages in thread

From: Gavin Sherry @ 2005-12-05 04:43 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Paul Lindner <[email protected]>; Bruce Momjian <[email protected]>; Neil Conway <[email protected]>; [email protected]

Hi all,

On Sun, 4 Dec 2005, Tom Lane wrote:

> Paul Lindner <[email protected]> writes:
> > To convert your pre-8.1 database to 8.1 you may have to remove and/or
> > fix the offending characters.  One simple way to fix the problem is to
> > run your pg_dump output through the iconv command like this:
>
> >   iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
>
> Is that really a one-size-fits-all solution?  Especially with -c?
>

It's definately not a one size fits all. The reassuring thing is that
others have tried to deal with this problem before.

Omar Kilani and I have spent a few hours looking at the problem. For
situations where there is a lot of invalid encoding, manual fixing is just
not viable. The vim project has a kind of fuzzy encoding conversion which
accounts for a lot of the non-UTF8 sequences in UTF8 data. You can use vim
to modify your text dump as follows:

vim -c ":wq! ++enc=utf8 fixed.dump" original.dump

Now, our testing of this is far from exhaustive but it's a lot better than
just cutting the data from the original dump. Those suffering the problem
should definately check this out, particularly if you have a non-trivial
amount of data.

Thanks,

Gavin



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-06 19:26  Bruce Momjian <[email protected]>
  parent: Paul Lindner <[email protected]>
  1 sibling, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-06 19:26 UTC (permalink / raw)
  To: Paul Lindner <[email protected]>; +Cc: Neil Conway <[email protected]>; Tom Lane <[email protected]>; [email protected]


I have added your suggestions to the 8.1.X release notes.

---------------------------------------------------------------------------

Paul Lindner wrote:
-- Start of PGP signed section.
> On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> > Neil Conway wrote:
> > > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > > It's been about a month since 8.1.0 was released, and we've found about
> > > > the usual number of bugs for a new release, so it seems like it's time
> > > > for 8.1.1.
> > > 
> > > I think one fix that should be made in time for 8.1.1 is adding a note
> > > to the "version migration" section of the 8.1 release notes describing
> > > the "invalid UTF-8 byte sequence" problems that some people have run
> > > into when upgrading from prior versions. I'm not familiar enough with
> > > the problem or its remedies to add the note myself, though.
> > 
> > Agreed, but I don't understand the problem well enough either.  Does
> > anyone?
> 
> There was a thread a couple of weeks back about this problem.  Here's
> my sample writeup -- I give my permission for anyone to use it as they
> see fit:
> 
> 
> Upgrading UNICODE databases to 8.1
> 
> Postgres 8.1 includes a number of bug-fixes and improvements to
> Unicode and UTF-8 character handling.  Unfortunately previous releases
> would accept character sequences that were not valid UTF-8.  This
> may cause problems when upgrading your database using
> pg_dump/pg_restore resulting in an error message like this:
> 
>   Invalid UNICODE byte sequence detected near byte ...
> 
> To convert your pre-8.1 database to 8.1 you may have to remove and/or
> fix the offending characters.  One simple way to fix the problem is to
> run your pg_dump output through the iconv command like this:
> 
>   iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> 
> The -c flag tells iconv to omit invalid characters from output.
> 
> There is one problem with this.  Most versions of iconv try to read
> the entire input file into memory.  If you dump is quite large you
> will need to split the dump into multiple files and convert each one
> individually.  You must use the -l flag for split to insure that the
> unicode byte sequences are not split.
> 
>    split -l 10000 dump.sql
> 
> Another possible solution is to use the --inserts flag to pg_dump.
> When you load the resulting data dump in 8.1 this will result in the
> problem rows showing up in your error log.
> 
> -- 
> Paul Lindner        ||||| | | | |  |  |  |   |   |
> [email protected]
-- End of PGP section, PGP failed!

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-06 19:27  Tom Lane <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Tom Lane @ 2005-12-06 19:27 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Paul Lindner <[email protected]>; Neil Conway <[email protected]>; [email protected]

Bruce Momjian <[email protected]> writes:
> I have added your suggestions to the 8.1.X release notes.

Did you read the followup discussion?  Recommending -c without a large
warning seems a very bad idea.

			regards, tom lane



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-06 19:32  Bruce Momjian <[email protected]>
  parent: Tom Lane <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-06 19:32 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Paul Lindner <[email protected]>; Neil Conway <[email protected]>; [email protected]

Tom Lane wrote:
> Bruce Momjian <[email protected]> writes:
> > I have added your suggestions to the 8.1.X release notes.
> 
> Did you read the followup discussion?  Recommending -c without a large
> warning seems a very bad idea.

Well, I said it would remove invalid sequences.  What else should we
say?

	This will remove invalid character sequences.

I saw no clear solution that allowed sequences to be corrected.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-06 20:25  Bruce Momjian <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-06 20:25 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Tom Lane <[email protected]>; Paul Lindner <[email protected]>; Neil Conway <[email protected]>; [email protected]

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <[email protected]> writes:
> > > I have added your suggestions to the 8.1.X release notes.
> > 
> > Did you read the followup discussion?  Recommending -c without a large
> > warning seems a very bad idea.
> 
> Well, I said it would remove invalid sequences.  What else should we
> say?
> 
> 	This will remove invalid character sequences.
> 
> I saw no clear solution that allowed sequences to be corrected.

The release note text is:

	Some users are having problems loading <literal>UTF8</> data into 8.1.X.
	 This is because previous versions allowed invalid <literal>UTF8</>
	sequences to be entered into the database, and this release properly
	accepts only valid <literal>UTF8</> sequences.	One way to correct a
	dumpfile is to use <command>iconv -c -f UTF-8 -t UTF-8</>. This will
	remove invalid character sequences. <command>iconv</> reads the entire
	input file into memory so it might be necessary to <command>split</> the
	dump into multiple smaller files for processing.

One nice solution would be if iconv would report the lines with errors
and you could correct them, but I see no way to do that.  The only thing
you could do is to diff the old and new files to see the problems.  Is
that helpful?  Here is new text I have used:

	Some users are having problems loading <literal>UTF8</> data into 8.1.X.
	 This is because previous versions allowed invalid <literal>UTF8</>
	sequences to be entered into the database, and this release properly
	accepts only valid <literal>UTF8</> sequences.  One way to correct a
	dumpfile is to use <command>iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql
	dumpfile.sql</>.  The <literal>-c</> option removes invalid character
	sequences.  A diff of the two files will show the sequences that are
	invalid.  <command>iconv</> reads the entire input file into memory so
	it might be necessary to <command>split</> the dump into multiple
	smaller files for processing.

It highlights the 'diff' idea.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-06 20:47  Peter Eisentraut <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Peter Eisentraut @ 2005-12-06 20:47 UTC (permalink / raw)
  To: [email protected]; +Cc: Bruce Momjian <[email protected]>

Bruce Momjian wrote:
> One nice solution would be if iconv would report the lines with
> errors and you could correct them, but I see no way to do that.  The
> only thing you could do is to diff the old and new files to see the
> problems.  Is that helpful?  Here is new text I have used:

I think this is nice.  It users see a big mess, they will have to clean
it up by hand anyway.

How about this for better wording:

diff -u -3 -p -r1.400.2.4 release.sgml
--- doc/src/sgml/release.sgml   6 Dec 2005 20:26:02 -0000       1.400.2.4
+++ doc/src/sgml/release.sgml   6 Dec 2005 20:44:26 -0000
@@ -528,15 +528,16 @@ psql -t -f fixseq.sql db1 | psql -e db1

      <listitem>
       <para>
-       Some users are having problems loading <literal>UTF8</> data into
-       8.1.X.  This is because previous versions allowed invalid <literal>UTF8</>
+       Some users are having problems loading UTF-8 data into
+       8.1.X.  This is because previous versions allowed invalid UTF-8 byte
        sequences to be entered into the database, and this release
-       properly accepts only valid <literal>UTF8</> sequences.  One
-       way to correct a dumpfile is to use <command>iconv -c -f UTF-8 -t UTF-8
+       properly accepts only valid UTF-8 sequences.  One
+       way to correct a dumpfile is to run the command <command>iconv -c -f UTF-8 -t UTF-8
        -o cleanfile.sql dumpfile.sql</>.  The <literal>-c</> option removes
        invalid character sequences.  A diff of the two files will show the
        sequences that are invalid.  <command>iconv</> reads the entire input
-       file into memory so it might be necessary to <command>split</> the dump
+       file into memory so it might be necessary to use <command>split</>
+       to break up the dump
        into multiple smaller files for processing.
       </para>
      </listitem>

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-06 21:00  Bruce Momjian <[email protected]>
  parent: Peter Eisentraut <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-06 21:00 UTC (permalink / raw)
  To: Peter Eisentraut <[email protected]>; +Cc: [email protected]


Nice, updated.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > One nice solution would be if iconv would report the lines with
> > errors and you could correct them, but I see no way to do that.  The
> > only thing you could do is to diff the old and new files to see the
> > problems.  Is that helpful?  Here is new text I have used:
> 
> I think this is nice.  It users see a big mess, they will have to clean
> it up by hand anyway.
> 
> How about this for better wording:
> 
> diff -u -3 -p -r1.400.2.4 release.sgml
> --- doc/src/sgml/release.sgml   6 Dec 2005 20:26:02 -0000       1.400.2.4
> +++ doc/src/sgml/release.sgml   6 Dec 2005 20:44:26 -0000
> @@ -528,15 +528,16 @@ psql -t -f fixseq.sql db1 | psql -e db1
> 
>       <listitem>
>        <para>
> -       Some users are having problems loading <literal>UTF8</> data into
> -       8.1.X.  This is because previous versions allowed invalid <literal>UTF8</>
> +       Some users are having problems loading UTF-8 data into
> +       8.1.X.  This is because previous versions allowed invalid UTF-8 byte
>         sequences to be entered into the database, and this release
> -       properly accepts only valid <literal>UTF8</> sequences.  One
> -       way to correct a dumpfile is to use <command>iconv -c -f UTF-8 -t UTF-8
> +       properly accepts only valid UTF-8 sequences.  One
> +       way to correct a dumpfile is to run the command <command>iconv -c -f UTF-8 -t UTF-8
>         -o cleanfile.sql dumpfile.sql</>.  The <literal>-c</> option removes
>         invalid character sequences.  A diff of the two files will show the
>         sequences that are invalid.  <command>iconv</> reads the entire input
> -       file into memory so it might be necessary to <command>split</> the dump
> +       file into memory so it might be necessary to use <command>split</>
> +       to break up the dump
>         into multiple smaller files for processing.
>        </para>
>       </listitem>
> 
> -- 
> Peter Eisentraut
> http://developer.postgresql.org/~petere/
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-07 03:45  Gavin Sherry <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Gavin Sherry @ 2005-12-07 03:45 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; [email protected]

Hi,

On Tue, 6 Dec 2005, Bruce Momjian wrote:

>
> Nice, updated.
>
> ---------------------------------------------------------------------------
>

I think my suggestion from the other day is useful also.

---

Omar Kilani and I have spent a few hours looking at the problem. For
situations where there is a lot of invalid encoding, manual fixing is just
not viable. The vim project has a kind of fuzzy encoding conversion which
accounts for a lot of the non-UTF8 sequences in UTF8 data. You can use vim
to modify your text dump as follows:

vim -c ":wq! ++enc=utf8 fixed.dump" original.dump

---

I think this is a viable option for people with a non-trivial amount of
data and don't see manual fixing or potentially losing data as a viable
option.

Thanks,

Gavin



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-07 04:37  Bruce Momjian <[email protected]>
  parent: Gavin Sherry <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-07 04:37 UTC (permalink / raw)
  To: Gavin Sherry <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; [email protected]


Exactly what does vim do that iconv does not?  Fuzzy encoding sounds
scary to me.

---------------------------------------------------------------------------

Gavin Sherry wrote:
> Hi,
> 
> On Tue, 6 Dec 2005, Bruce Momjian wrote:
> 
> >
> > Nice, updated.
> >
> > ---------------------------------------------------------------------------
> >
> 
> I think my suggestion from the other day is useful also.
> 
> ---
> 
> Omar Kilani and I have spent a few hours looking at the problem. For
> situations where there is a lot of invalid encoding, manual fixing is just
> not viable. The vim project has a kind of fuzzy encoding conversion which
> accounts for a lot of the non-UTF8 sequences in UTF8 data. You can use vim
> to modify your text dump as follows:
> 
> vim -c ":wq! ++enc=utf8 fixed.dump" original.dump
> 
> ---
> 
> I think this is a viable option for people with a non-trivial amount of
> data and don't see manual fixing or potentially losing data as a viable
> option.
> 
> Thanks,
> 
> Gavin
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-08 22:37  Gavin Sherry <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Gavin Sherry @ 2005-12-08 22:37 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; [email protected]

On Tue, 6 Dec 2005, Bruce Momjian wrote:

>
> Exactly what does vim do that iconv does not?  Fuzzy encoding sounds
> scary to me.
>

Right. It actually makes assumptions about the source encoding. People who
care about their data need, unfortunately, to spend a bit of time on this
problem. I've been discussing the same issue on the slony1 mailing list,
because the issue can affect people's ability upgrade using slony1.

http://gborg.postgresql.org/pipermail/slony1-general/2005-December/003430.html

It would be good if had the script I suggest in the email:

	A script which identifies non-utf-8 characters and provides some
	context, line numbers, etc, will greatly speed up the process of
	remedying the situation.

Thoughts?

Gavin



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-08 22:44  Bruce Momjian <[email protected]>
  parent: Gavin Sherry <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-08 22:44 UTC (permalink / raw)
  To: Gavin Sherry <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; [email protected]

Gavin Sherry wrote:
> On Tue, 6 Dec 2005, Bruce Momjian wrote:
> 
> >
> > Exactly what does vim do that iconv does not?  Fuzzy encoding sounds
> > scary to me.
> >
> 
> Right. It actually makes assumptions about the source encoding. People who
> care about their data need, unfortunately, to spend a bit of time on this
> problem. I've been discussing the same issue on the slony1 mailing list,
> because the issue can affect people's ability upgrade using slony1.
> 
> http://gborg.postgresql.org/pipermail/slony1-general/2005-December/003430.html
> 
> It would be good if had the script I suggest in the email:
> 
> 	A script which identifies non-utf-8 characters and provides some
> 	context, line numbers, etc, will greatly speed up the process of
> 	remedying the situation.

I think the best we can do is the "iconv -c with the diff" idea, which
is already in the release notes.  I suppose we could merge the iconv and
diff into a single command, but I don't see a portable way to output the
iconv output to stdout., /dev/stdin not being portable.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-08 22:54  Gregory Maxwell <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Gregory Maxwell @ 2005-12-08 22:54 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]

On 12/8/05, Bruce Momjian <[email protected]> wrote:
> >       A script which identifies non-utf-8 characters and provides some
> >       context, line numbers, etc, will greatly speed up the process of
> >       remedying the situation.
>
> I think the best we can do is the "iconv -c with the diff" idea, which
> is already in the release notes.  I suppose we could merge the iconv and
> diff into a single command, but I don't see a portable way to output the
> iconv output to stdout., /dev/stdin not being portable.

No, what is needed for people who care about fixing their data is a
loadable strip_invalid_utf8() that works in older versions.. then just
select * from bar where foo != strip_invalid_utf8(foo);  The function
would be useful in general, for example, if you have an application
which doesn't already have much utf8 logic, you want to use a text
field, and stripping is the behaviour you want. For example, lots of
simple web applications.



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-09 16:17  Martijn van Oosterhout <[email protected]>
  parent: Gregory Maxwell <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Martijn van Oosterhout @ 2005-12-09 16:17 UTC (permalink / raw)
  To: Gregory Maxwell <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]

On Thu, Dec 08, 2005 at 05:54:35PM -0500, Gregory Maxwell wrote:
> No, what is needed for people who care about fixing their data is a
> loadable strip_invalid_utf8() that works in older versions.. then just
> select * from bar where foo != strip_invalid_utf8(foo);  The function
> would be useful in general, for example, if you have an application
> which doesn't already have much utf8 logic, you want to use a text
> field, and stripping is the behaviour you want. For example, lots of
> simple web applications.

Would something like the following work? It's written in pl/pgsql and
does (AFAICS) the same checking as the backend in recent releases.
Except the backend only supports up to 4-byte UTF-8 whereas this
function checks upto six byte. For a six byte UTF-8 character, who is
wrong?

In any case, people should be able to do something like:

SELECT field FROM table WHERE NOT utf8_verify(field,4);

To check conformance with PostgreSQL 8.1. Note, I don't have large
chunks of UTF-8 to test with but it works for the characters I tried
with. Tested with 7.4.

Have a nice day,
-- 
Martijn van Oosterhout   <[email protected]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

create or replace function utf8_verify(bytea,integer) returns bool as '
DECLARE
   str ALIAS FOR $1;
   maxlen ALIAS FOR $2;
   strlen INTEGER;
   i integer;
   j INTEGER;
   len integer;
   chr integer;
   wchr integer;
BEGIN
   i := 0;
   strlen := length(str);

   WHILE i < strlen LOOP

     -- Check leading byte
     chr := get_byte(str,i);

     IF chr < 128 THEN     -- 0x00 - 0x80   - single byte
        len := 1;
        wchr := chr;
     ELSIF chr < 192 THEN  -- 0x80 - 0xC0   - illegal
        RETURN false;
     ELSIF chr < 224 THEN  -- 0xC0 - 0xE0   - two bytes
        len := 2;
        wchr := chr - 192;
     ELSIF chr < 240 THEN  -- 0xE0 - 0xF0   - three bytes
        len := 3;
        wchr := chr - 224;
     ELSIF chr < 248 THEN  -- 0xF0 - 0xF8   - four bytes
        len := 4;
        wchr := chr - 240;
     ELSIF chr < 252 THEN  -- 0xF8 - 0xFC   - five bytes
        len := 5;
        wchr := chr - 248;
     ELSIF chr < 254 THEN  -- 0xFC - 0xFE   - six bytes
        len := 6;
        wchr := chr - 252;
     ELSE
        RETURN false;   -- FE and FF not currently defined
     END IF;

     IF i + len > strlen THEN
        RETURN false;
     END IF;

     IF len > maxlen THEN
        RETURN false;
     END IF;

     -- Check remaining characters
     j := 1;
     WHILE len > j LOOP
        chr := get_byte(str, i+j);
        IF chr < 128 OR chr >= 192 THEN
            RETURN false;
        END IF;
        wchr := (wchr << 6) + (chr - 192);
        j := j+1;
     END LOOP;

     -- Verify shortest possible string
     IF len = 1 AND wchr >= 128 THEN
        RETURN false;
     ELSIF len = 2 AND (wchr < 128 OR wchr >= 2048) THEN
        RETURN false;
     ELSIF len = 3 AND (wchr < 2048 OR wchr >= 65536) THEN
        RETURN false;
     ELSIF len = 4 AND (wchr < 65536 OR wchr >= 2097152) THEN
        RETURN false;
     ELSIF len = 5 AND (wchr < 2097152 OR wchr >= 67108864) THEN
        RETURN false;
     ELSIF len = 6 AND (wchr < 67108864 OR wchr >= 2147483648) THEN
        RETURN false;
     END IF;

--     RAISE NOTICE ''Checked char offset %, OK (wchr=%,len=%)'', i, wchr, len;

     i := i+len;
   END LOOP;

  RETURN true;
END;
' language plpgsql;


Attachments:

  [text/plain] utf8_verify.sql (2.1K, 2-utf8_verify.sql)
  download | inline:
create or replace function utf8_verify(bytea,integer) returns bool as '
DECLARE
   str ALIAS FOR $1;
   maxlen ALIAS FOR $2;
   strlen INTEGER;
   i integer;
   j INTEGER;
   len integer;
   chr integer;
   wchr integer;
BEGIN
   i := 0;
   strlen := length(str);

   WHILE i < strlen LOOP

     -- Check leading byte
     chr := get_byte(str,i);

     IF chr < 128 THEN     -- 0x00 - 0x80   - single byte
        len := 1;
        wchr := chr;
     ELSIF chr < 192 THEN  -- 0x80 - 0xC0   - illegal
        RETURN false;
     ELSIF chr < 224 THEN  -- 0xC0 - 0xE0   - two bytes
        len := 2;
        wchr := chr - 192;
     ELSIF chr < 240 THEN  -- 0xE0 - 0xF0   - three bytes
        len := 3;
        wchr := chr - 224;
     ELSIF chr < 248 THEN  -- 0xF0 - 0xF8   - four bytes
        len := 4;
        wchr := chr - 240;
     ELSIF chr < 252 THEN  -- 0xF8 - 0xFC   - five bytes
        len := 5;
        wchr := chr - 248;
     ELSIF chr < 254 THEN  -- 0xFC - 0xFE   - six bytes
        len := 6;
        wchr := chr - 252;
     ELSE
        RETURN false;   -- FE and FF not currently defined
     END IF;

     IF i + len > strlen THEN
        RETURN false;
     END IF;

     IF len > maxlen THEN
        RETURN false;
     END IF;

     -- Check remaining characters
     j := 1;
     WHILE len > j LOOP
        chr := get_byte(str, i+j);
        IF chr < 128 OR chr >= 192 THEN
            RETURN false;
        END IF;
        wchr := (wchr << 6) + (chr - 192);
        j := j+1;
     END LOOP;

     -- Verify shortest possible string
     IF len = 1 AND wchr >= 128 THEN
        RETURN false;
     ELSIF len = 2 AND (wchr < 128 OR wchr >= 2048) THEN
        RETURN false;
     ELSIF len = 3 AND (wchr < 2048 OR wchr >= 65536) THEN
        RETURN false;
     ELSIF len = 4 AND (wchr < 65536 OR wchr >= 2097152) THEN
        RETURN false;
     ELSIF len = 5 AND (wchr < 2097152 OR wchr >= 67108864) THEN
        RETURN false;
     ELSIF len = 6 AND (wchr < 67108864 OR wchr >= 2147483648) THEN
        RETURN false;
     END IF;

--     RAISE NOTICE ''Checked char offset %, OK (wchr=%,len=%)'', i, wchr, len;

     i := i+len;
   END LOOP;

  RETURN true;
END;
' language plpgsql;

^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-09 16:34  Bruce Momjian <[email protected]>
  parent: Martijn van Oosterhout <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-09 16:34 UTC (permalink / raw)
  To: Martijn van Oosterhout <[email protected]>; +Cc: Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> On Thu, Dec 08, 2005 at 05:54:35PM -0500, Gregory Maxwell wrote:
> > No, what is needed for people who care about fixing their data is a
> > loadable strip_invalid_utf8() that works in older versions.. then just
> > select * from bar where foo != strip_invalid_utf8(foo);  The function
> > would be useful in general, for example, if you have an application
> > which doesn't already have much utf8 logic, you want to use a text
> > field, and stripping is the behaviour you want. For example, lots of
> > simple web applications.
> 
> Would something like the following work? It's written in pl/pgsql and
> does (AFAICS) the same checking as the backend in recent releases.
> Except the backend only supports up to 4-byte UTF-8 whereas this
> function checks upto six byte. For a six byte UTF-8 character, who is
> wrong?
> 
> In any case, people should be able to do something like:
> 
> SELECT field FROM table WHERE NOT utf8_verify(field,4);
> 
> To check conformance with PostgreSQL 8.1. Note, I don't have large
> chunks of UTF-8 to test with but it works for the characters I tried
> with. Tested with 7.4.

I think the problem with any kind of function-call detection is that the
data has to get into the database first, and it isn't clear how someone
loading a failed dump would do that aside from modifying the column to
bytea in the dump, loading it in, then fixing it.  The iconv idea has
the advantage that it can be fixed before loading into the database.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-09 16:44  Martijn van Oosterhout <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Martijn van Oosterhout @ 2005-12-09 16:44 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]

On Fri, Dec 09, 2005 at 11:34:22AM -0500, Bruce Momjian wrote:
> I think the problem with any kind of function-call detection is that the
> data has to get into the database first, and it isn't clear how someone
> loading a failed dump would do that aside from modifying the column to
> bytea in the dump, loading it in, then fixing it.  The iconv idea has
> the advantage that it can be fixed before loading into the database.

The point of this function is to test the data *before* you even create
the dump, while it is still running on 7.4 or 8.0.

This means someone who is planning on upgrading to 8.1 in two months
can use this function now to weed out the bad data before the upgrade
even starts.

Have a nice day,
-- 
Martijn van Oosterhout   <[email protected]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-09 17:38  Bruce Momjian <[email protected]>
  parent: Martijn van Oosterhout <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-09 17:38 UTC (permalink / raw)
  To: Martijn van Oosterhout <[email protected]>; +Cc: Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> On Fri, Dec 09, 2005 at 11:34:22AM -0500, Bruce Momjian wrote:
> > I think the problem with any kind of function-call detection is that the
> > data has to get into the database first, and it isn't clear how someone
> > loading a failed dump would do that aside from modifying the column to
> > bytea in the dump, loading it in, then fixing it.  The iconv idea has
> > the advantage that it can be fixed before loading into the database.
> 
> The point of this function is to test the data *before* you even create
> the dump, while it is still running on 7.4 or 8.0.
> 
> This means someone who is planning on upgrading to 8.1 in two months
> can use this function now to weed out the bad data before the upgrade
> even starts.

Oh, so you back-load it into the old database.  Interesting.  I assume
to be useful you would have to write something that checked every column
values in every table and database.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-09 18:28  Martijn van Oosterhout <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Martijn van Oosterhout @ 2005-12-09 18:28 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>; [email protected]

On Fri, Dec 09, 2005 at 12:38:21PM -0500, Bruce Momjian wrote:
> > This means someone who is planning on upgrading to 8.1 in two months
> > can use this function now to weed out the bad data before the upgrade
> > even starts.
> 
> Oh, so you back-load it into the old database.  Interesting.  I assume
> to be useful you would have to write something that checked every column
> values in every table and database.

Umm, yeah. I was thinking about how to do that. pl/pgsql is not the
best language to do that in. In any case I found a bug in the version I
posted and also added a function that does:

test=# select * from db_utf8_verify();
 tab  | fld | location 
------+-----+----------
 tbl1 | foo | (12,3)
(1 row)

It gives the table, field and ctid of any values that failed. It skips
pg_catalog. It's also *really* slow for long strings. Just executing it
on the pg_rewrite in the default installation takes forever. If someone
really wanted this for a large database maybe they should recode it in
C.

http://svana.org/kleptog/pgsql/utf8_verify.sql

Have a nice day,
-- 
Martijn van Oosterhout   <[email protected]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-10 23:50  Robert Treat <[email protected]>
  parent: Martijn van Oosterhout <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Robert Treat @ 2005-12-10 23:50 UTC (permalink / raw)
  To: [email protected]; Martijn van Oosterhout <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>

Was thinking if someone could summarize this all it would make a really good 
FAQ entry. 

Robert Treat

On Friday 09 December 2005 13:28, Martijn van Oosterhout wrote:
> On Fri, Dec 09, 2005 at 12:38:21PM -0500, Bruce Momjian wrote:
> > > This means someone who is planning on upgrading to 8.1 in two months
> > > can use this function now to weed out the bad data before the upgrade
> > > even starts.
> >
> > Oh, so you back-load it into the old database.  Interesting.  I assume
> > to be useful you would have to write something that checked every column
> > values in every table and database.
>
> Umm, yeah. I was thinking about how to do that. pl/pgsql is not the
> best language to do that in. In any case I found a bug in the version I
> posted and also added a function that does:
>
> test=# select * from db_utf8_verify();
>  tab  | fld | location
> ------+-----+----------
>  tbl1 | foo | (12,3)
> (1 row)
>
> It gives the table, field and ctid of any values that failed. It skips
> pg_catalog. It's also *really* slow for long strings. Just executing it
> on the pg_rewrite in the default installation takes forever. If someone
> really wanted this for a large database maybe they should recode it in
> C.
>
> http://svana.org/kleptog/pgsql/utf8_verify.sql
>
> Have a nice day,

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-11 02:18  Bruce Momjian <[email protected]>
  parent: Robert Treat <[email protected]>
  0 siblings, 1 reply; 55+ messages in thread

From: Bruce Momjian @ 2005-12-11 02:18 UTC (permalink / raw)
  To: Robert Treat <[email protected]>; +Cc: [email protected]; Martijn van Oosterhout <[email protected]>; Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>


I don't see it asked very often, and I think our 8.1 releae note
addition (plus a mention in the 8.1.1 notes) will complete this.

---------------------------------------------------------------------------

Robert Treat wrote:
> Was thinking if someone could summarize this all it would make a really good 
> FAQ entry. 
> 
> Robert Treat
> 
> On Friday 09 December 2005 13:28, Martijn van Oosterhout wrote:
> > On Fri, Dec 09, 2005 at 12:38:21PM -0500, Bruce Momjian wrote:
> > > > This means someone who is planning on upgrading to 8.1 in two months
> > > > can use this function now to weed out the bad data before the upgrade
> > > > even starts.
> > >
> > > Oh, so you back-load it into the old database.  Interesting.  I assume
> > > to be useful you would have to write something that checked every column
> > > values in every table and database.
> >
> > Umm, yeah. I was thinking about how to do that. pl/pgsql is not the
> > best language to do that in. In any case I found a bug in the version I
> > posted and also added a function that does:
> >
> > test=# select * from db_utf8_verify();
> >  tab  | fld | location
> > ------+-----+----------
> >  tbl1 | foo | (12,3)
> > (1 row)
> >
> > It gives the table, field and ctid of any values that failed. It skips
> > pg_catalog. It's also *really* slow for long strings. Just executing it
> > on the pg_rewrite in the default installation takes forever. If someone
> > really wanted this for a large database maybe they should recode it in
> > C.
> >
> > http://svana.org/kleptog/pgsql/utf8_verify.sql
> >
> > Have a nice day,
> 
> -- 
> Robert Treat
> Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [email protected]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073



^ permalink  raw  reply  [nested|flat] 55+ messages in thread

* Re: Upcoming PG re-releases
@ 2005-12-11 05:30  Joshua D. Drake <[email protected]>
  parent: Bruce Momjian <[email protected]>
  0 siblings, 0 replies; 55+ messages in thread

From: Joshua D. Drake @ 2005-12-11 05:30 UTC (permalink / raw)
  To: Bruce Momjian <[email protected]>; +Cc: Robert Treat <[email protected]>; [email protected]; Martijn van Oosterhout <[email protected]>; Gregory Maxwell <[email protected]>; Gavin Sherry <[email protected]>; Peter Eisentraut <[email protected]>

Bruce Momjian wrote:
> I don't see it asked very often, and I think our 8.1 releae note
> addition (plus a mention in the 8.1.1 notes) will complete this.
>
>   
Actually a "upgrade" FAQ is probably a good idea. Something that says 
what really happens
when foo changes in 8.1 or how foo is different then 8.0.

The idea that there is a practical (for those that have practical 
implications) resource for finding
out what it really means that the UTF-8 stuff changed .

Joshua D. Drake


> ---------------------------------------------------------------------------
>
> Robert Treat wrote:
>   
>> Was thinking if someone could summarize this all it would make a really good 
>> FAQ entry. 
>>
>> Robert Treat
>>
>> On Friday 09 December 2005 13:28, Martijn van Oosterhout wrote:
>>     
>>> On Fri, Dec 09, 2005 at 12:38:21PM -0500, Bruce Momjian wrote:
>>>       
>>>>> This means someone who is planning on upgrading to 8.1 in two months
>>>>> can use this function now to weed out the bad data before the upgrade
>>>>> even starts.
>>>>>           
>>>> Oh, so you back-load it into the old database.  Interesting.  I assume
>>>> to be useful you would have to write something that checked every column
>>>> values in every table and database.
>>>>         
>>> Umm, yeah. I was thinking about how to do that. pl/pgsql is not the
>>> best language to do that in. In any case I found a bug in the version I
>>> posted and also added a function that does:
>>>
>>> test=# select * from db_utf8_verify();
>>>  tab  | fld | location
>>> ------+-----+----------
>>>  tbl1 | foo | (12,3)
>>> (1 row)
>>>
>>> It gives the table, field and ctid of any values that failed. It skips
>>> pg_catalog. It's also *really* slow for long strings. Just executing it
>>> on the pg_rewrite in the default installation takes forever. If someone
>>> really wanted this for a large database maybe they should recode it in
>>> C.
>>>
>>> http://svana.org/kleptog/pgsql/utf8_verify.sql
>>>
>>> Have a nice day,
>>>       
>> -- 
>> Robert Treat
>> Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
>>
>>     
>
>   





^ permalink  raw  reply  [nested|flat] 55+ messages in thread


end of thread, other threads:[~2005-12-11 05:30 UTC | newest]

Thread overview: 55+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2005-11-30 15:56 Upcoming PG re-releases Tom Lane <[email protected]>
2005-11-30 16:13 ` Andrew Dunstan <[email protected]>
2005-11-30 16:40   ` Tom Lane <[email protected]>
2005-11-30 18:23     ` Robert Treat <[email protected]>
2005-11-30 18:30       ` Marc G. Fournier <[email protected]>
2005-12-01 01:36       ` David Fetter <[email protected]>
2005-12-01 03:56         ` Marc G. Fournier <[email protected]>
2005-12-01 04:22           ` Joshua D. Drake <[email protected]>
2005-12-01 05:15           ` David Fetter <[email protected]>
2005-12-03 16:10             ` Kevin Brown <[email protected]>
2005-12-01 02:21     ` Andrew Dunstan <[email protected]>
2005-12-01 01:21 ` Christopher Kings-Lynne <[email protected]>
2005-12-03 15:56 ` Neil Conway <[email protected]>
2005-12-03 15:54   ` Bruce Momjian <[email protected]>
2005-12-04 16:25     ` Paul Lindner <[email protected]>
2005-12-04 16:34       ` Tom Lane <[email protected]>
2005-12-04 16:40         ` Paul Lindner <[email protected]>
2005-12-04 16:52           ` Tom Lane <[email protected]>
2005-12-04 17:19             ` Gregory Maxwell <[email protected]>
2005-12-04 18:55               ` Martijn van Oosterhout <[email protected]>
2005-12-05 04:43         ` Gavin Sherry <[email protected]>
2005-12-06 19:26       ` Bruce Momjian <[email protected]>
2005-12-06 19:27         ` Tom Lane <[email protected]>
2005-12-06 19:32           ` Bruce Momjian <[email protected]>
2005-12-06 20:25             ` Bruce Momjian <[email protected]>
2005-12-06 20:47               ` Peter Eisentraut <[email protected]>
2005-12-06 21:00                 ` Bruce Momjian <[email protected]>
2005-12-07 03:45                   ` Gavin Sherry <[email protected]>
2005-12-07 04:37                     ` Bruce Momjian <[email protected]>
2005-12-08 22:37                       ` Gavin Sherry <[email protected]>
2005-12-08 22:44                         ` Bruce Momjian <[email protected]>
2005-12-08 22:54                           ` Gregory Maxwell <[email protected]>
2005-12-09 16:17                             ` Martijn van Oosterhout <[email protected]>
2005-12-09 16:34                               ` Bruce Momjian <[email protected]>
2005-12-09 16:44                                 ` Martijn van Oosterhout <[email protected]>
2005-12-09 17:38                                   ` Bruce Momjian <[email protected]>
2005-12-09 18:28                                     ` Martijn van Oosterhout <[email protected]>
2005-12-10 23:50                                       ` Robert Treat <[email protected]>
2005-12-11 02:18                                         ` Bruce Momjian <[email protected]>
2005-12-11 05:30                                           ` Joshua D. Drake <[email protected]>
2005-11-30 18:33 Re: [HACKERS] Upcoming PG re-releases Magnus Hagander <[email protected]>
2005-11-30 18:39 ` Marc G. Fournier <[email protected]>
2005-11-30 22:31   ` Robert Bernier <[email protected]>
2005-11-30 22:41     ` Joshua D. Drake <[email protected]>
2005-11-30 22:39       ` Tom Lane <[email protected]>
2005-11-30 18:59 ` Robert Treat <[email protected]>
2005-12-01 10:06   ` Richard Huxton <[email protected]>
2005-12-01 10:35     ` Euler Taveira de Oliveira <[email protected]>
2005-12-01 12:09       ` Peter Eisentraut <[email protected]>
2005-12-01 17:00         ` Marc G. Fournier <[email protected]>
2005-12-01 12:20       ` Csaba Nagy <[email protected]>
2005-12-01 12:37         ` Richard Huxton <[email protected]>
2005-12-01 17:10 Re: [HACKERS] Upcoming PG re-releases Dave Page <[email protected]>
2005-12-01 17:21 ` Andrew Dunstan <[email protected]>
2005-12-01 18:26 Re: [HACKERS] Upcoming PG re-releases Magnus Hagander <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox