Received: from maia.hub.org (maia-2.hub.org [200.46.204.251]) by mail.postgresql.org (Postfix) with ESMTP id CF4DCB5DBD8 for ; Mon, 13 Jun 2011 14:24:11 -0300 (ADT) Received: from mail.postgresql.org ([200.46.204.86]) by maia.hub.org (mx1.hub.org [200.46.204.251]) (amavisd-maia, port 10024) with ESMTP id 65253-01 for ; Mon, 13 Jun 2011 17:24:05 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mail-pv0-f174.google.com (mail-pv0-f174.google.com [74.125.83.174]) by mail.postgresql.org (Postfix) with ESMTP id 561EDB5DBCD for ; Mon, 13 Jun 2011 14:24:04 -0300 (ADT) Received: by pvg12 with SMTP id 12so2208879pvg.19 for ; Mon, 13 Jun 2011 10:24:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=yXLhTY5XLXHdMYANVeNY8uJpdYJ6ScGatUNPH9ahwio=; b=ZsqOUohW+JKv6E59fGdt0/1BT01Y9mygnCBIOcduLmFVaWIz5wf+GY3HZ/f/Tww5CY D6dRLuI5IJMb1FW4UjUcbDvcDRqNw+genA8cE516PvQVEv7hFgnxLkCJNsWV2dyPF3o7 c4pbdmSrq5kENcn2jVqLbn2AvUow/yPqRbSbo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=Lx9NSWKJoektoMkVQ5QBx0gnZPhaUG3r10aB3gVO0AgqVy1h86X7RNZpPfZ8V3B+CP d7mMWDGRdJ2FMjjMVWAVD8cnqVM1HqVpqjTuisej6IMDFImV99MCYNmR6BaH2vu1Mnjs VkPnXo18zQoZZIL4fvri+P0lJiY+OPFIPAso0= MIME-Version: 1.0 Received: by 10.68.19.131 with SMTP id f3mr2429129pbe.379.1307985842152; Mon, 13 Jun 2011 10:24:02 -0700 (PDT) Received: by 10.68.56.70 with HTTP; Mon, 13 Jun 2011 10:24:02 -0700 (PDT) In-Reply-To: <4DE89072.7070305@usit.uio.no> References: <4DE89072.7070305@usit.uio.no> Date: Mon, 13 Jun 2011 13:24:02 -0400 Message-ID: Subject: Re: Information about WAL Configuration needs an update From: Robert Haas To: Rafael Martinez Cc: pgsql-docs@postgresql.org Content-Type: text/plain; charset=ISO-8859-1 X-Virus-Scanned: Maia Mailguard 1.0.1 X-Spam-Status: No, hits=0.002 tagged_above=-5 required=5 tests=FREEMAIL_FROM=0.001, RFC_ABUSE_POST=0.001 X-Spam-Level: X-Archive-Number: 201106/49 X-Sequence-Number: 6823 On Fri, Jun 3, 2011 at 3:42 AM, Rafael Martinez wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello > > I am sending this email because I think the section > "28.4. WAL Configuration" [1] of the manual needs to be improved to > avoid some potential configuration problems. > > I have experienced the problem I am going to describe myself and it is > not the first time other postgres users have asked about this in other > forums. > > This section says among other things (pg-8.3): > > " ..... There will always be at least one WAL segment file, and will > normally not be more than (2 + checkpoint_completion_target) * > checkpoint_segments + 1 files. Each segment file is normally 16 MB > (though this size can be altered when building the server). You can use > this to estimate space requirements for WAL. Ordinarily, when old log > segment files are no longer needed, they are recycled (renamed to become > the next segments in the numbered sequence). If, due to a short-term > peak of log output rate, there are more than 3 * checkpoint_segments + 1 > segment files, the unneeded segment files will be deleted instead of > recycled until the system gets back under this limit....." > > For 9.0 it is almost the same but with some additional information about > wal_keep_segments. > > The part I think should be improved by a note or an extra paragraph is > this one "... If, due to a short-term peak of log output rate ..." > > What is the meaning of a "short-term peak" and how many WAL files over > the (3 * checkpoint_segments + 1 segment files) limit can we expect > during a short-term peak? > > I sent some days ago an email to pgsql-general about this, REF: > http://archives.postgresql.org/pgsql-general/2011-05/msg00764.php > > But we did not get to any conclusion about how much disk for WAL files > is really necessary. > > I've run some tests to try to get some numbers that can explain what > happens in my case. > > What we have seen is that when creating a GIN index in a tsvector column > the number of WAL files grow almost proportionally with the size of the > index we are creating. > > The GIN index we are creating on a ~7GB table in one our system is > around 17GB. > > The amount of WAL files in this system will grow to 1353 WAL files while > this GIN index is being created (checkpoint_segments=128, > checkpoint_completion_target=0.5 and checkpoint_timeout=5min) > > Normally, the amount of WAL files according to the documentation should > be between 321 to 385 in our case. But it doesn't say anything about how > many WAL files you can expect during a "short-term peak" and what can > provoke this. > > In our case we got over 1000 "extra" WAL files that it is almost the > equivalent to the 17GB of our GIN index. The amount of WAL files got > back to a normal level after this GIN index was generated. > > You can see the graph with the generation of WAL files + some extra > information for this test here: http://folk.uio.no/rafael/total_wal/ > > What do you think? Shouldn't we update the documentation with some > information about this? Perhaps, but we'd have to think of something intelligent to say about it first. We can't remove the old WAL files until we successfully checkpoint, and so I think if checkpoints are taking a very long to complete or failing altogether, there's actually no upper bound. I don't think we have any kind of "hard stop" where, if no log space is available, we just refuse to process write transactions - such a thing would seem to be rather dangerous. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company