Message-ID: <472A37FD.60501@hagander.net>
Date: Thu, 01 Nov 2007 21:33:01 +0100
From: Magnus Hagander <magnus@hagander.net>
User-Agent: Thunderbird 2.0.0.6 (X11/20070728)
MIME-Version: 1.0
To: "Marc G. Fournier" <scrappy@hub.org>
CC: Andrew Sullivan <ajs@crankycanuck.ca>,  pgsql-www@postgresql.org
Subject: Re: what is up with the PG mailing lists?
References: <25716.1193887595@sss.pgh.pa.us>
	<DADF296033D290EF9634F611@ganymede.hub.org>
	<26669.1193891360@sss.pgh.pa.us> <47299585.7030402@hagander.net>
	<47299957.5020605@postgresql.org> <2968.1193919208@sss.pgh.pa.us>
	<20071101080959.49f3087b@scratch>
	<20071101152333.GM27676@crankycanuck.ca>
	<4729F105.30704@hagander.net>
	<1127E6493CBA8A29F343C4D7@ganymede.hub.org>
	<4729F7D2.6050608@hagander.net>
	<AD9BF3BA60F6634EA7FCDB76@ganymede.hub.org>
In-Reply-To: <AD9BF3BA60F6634EA7FCDB76@ganymede.hub.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Marc G. Fournier wrote:
>> No. All those cases are reasons for acceptable delays. But how often
>> does, say, network connectivity go away for an hour? If it does, you
>> need a better hosting provider.
> 
> You really don't have a clue on how an SMTP server works, do you?  If delivery 

Well, it's been a couple of years since I last wrote a code patch for an
SMTP server, but yeah, I have a fair clue on how it works. And I do run
servers that deliver some 100,000 mails a day. I know, it's not much,
but I know enough to keep those working, and I've never seen internal
delays like what we're seeing here.


> fails, it backs up and tries again *later* ... if there is a high volume of 
> email going through said server, *later* could very well be 1 hour ... and, in 
> fact, it's an incremental backup, so it actually works out to be something like:
> 
> Try now, fail, try in 5 minutes, fail, try in 10 minutes, fail, try in 20 
> minutes, fail, etc ... I'm not sure if it's a simple '2x' algorithm, but the 
> delay between attempts does get progressively greater, so if it fails after 
> trying at '40 minutes', then it will be another hour and a half after *that* 
> before it will try again, etc ...

That's an implementation detail that differs wildly between SMTP
servers. But you already know that, of course. Postfix, specifically,
implements a '2x' algorithm. There's also a minimum backoff time
(configurable in new versions, previously fixed at 1000 seconds) and a
maximum backoff time (configurable).
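
To make the '2x' rule concrete, here is a minimal sketch of a doubling
backoff clamped between a minimum and a maximum. It is not Postfix's
actual code: the function name retry_schedule is made up for
illustration, and the 4000 second maximum is an assumed example value
(only the 1000 second minimum comes from the text above).

```python
def retry_schedule(min_backoff, max_backoff, attempts):
    """Return the wait in seconds before each retry.

    Generic clamped doubling: start at min_backoff, double after every
    failed attempt, and never wait longer than max_backoff.
    This is an illustration, not a copy of any real MTA's scheduler.
    """
    delays = []
    delay = min_backoff
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_backoff)
    return delays


# 1000 s is the old fixed minimum mentioned above; 4000 s is an assumed maximum.
schedule = retry_schedule(1000, 4000, 5)
print(schedule)       # [1000, 2000, 4000, 4000, 4000]
print(sum(schedule))  # total seconds waited if all five attempts fail
```

With these numbers, a message has to fail delivery twice before it has
waited close to an hour, which is the point of the argument that follows.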


The main question remains. As Tom posted again in this thread, the delay
happens *internally between hub.org machines*. By your reasoning, that
means it's getting multiple failures to move mail internally. To me,
that's a clear indication that something is wrong. I'm sorry to hear you
don't agree.
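
For anyone who wants to check where a message sat, the per-hop delay can
be read off the Received headers. The following is a rough sketch using
only the Python standard library. The function name hop_delays and the
sample host names are hypothetical, and it assumes every Received header
ends with "; <date>" in a form that email.utils can parse.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def hop_delays(raw_message):
    """Return the seconds spent between consecutive Received hops."""
    msg = message_from_string(raw_message)
    # Each server prepends its Received header, so the newest hop is
    # first. Reverse the list to get the order the mail travelled in.
    received = list(reversed(msg.get_all("Received", [])))
    stamps = []
    for header in received:
        # The timestamp is whatever follows the last ';' in the header.
        _, _, date_part = header.rpartition(";")
        stamps.append(parsedate_to_datetime(date_part.strip()))
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(stamps, stamps[1:])
    ]


# Made-up example: one minute on the first hop, two hours on the second.
SAMPLE = (
    "Received: from b.example by c.example; Thu, 01 Nov 2007 21:30:00 +0100\n"
    "Received: from a.example by b.example; Thu, 01 Nov 2007 19:30:00 +0100\n"
    "Received: from client by a.example; Thu, 01 Nov 2007 19:29:00 +0100\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(hop_delays(SAMPLE))  # [60.0, 7200.0]
```

A large gap between two hops that belong to the same operator points at
an internal queue rather than at the sender or the final recipient.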


>> A couple of minutes delay is perfectly acceptable. A couple of hours is
>> an indication that something is wrong.
> 
> Well, when you see a couple of hours delay, then do something *useful* and let 
> me know ... the only *useful* reports I've had in the past 24 hours dealt with 
> a problem that Tom reported yesterday and that I fixed within minutes of him 
> reporting ... the headers that you and Bruce sent me were *from that problem* 
> ...

I have given up. I used to send these, but nothing is fixed. Maybe I
should set up a procmail script to capture them...

Oh, and the headers I sent were because the email was stuck in the
moderation queue.


//Magnus