Subject: Re: [PATCH] pgarchives: parser: handle messages in which Message-ID is missing
To: Magnus Hagander, Alvaro Herrera
Cc:
 Célestin Matte, pgsql-www@lists.postgresql.org
References: <53316703-db69-7067-c82e-47a598711595@cmatte.me>
 <202111041947.x5su2sdjdx74@alvherre.pgsql>
From: Stefan Kaltenbrunner
Message-ID: <61fb4078-6b27-9921-7f7d-0345978d8300@kaltenbrunner.cc>
Date: Fri, 5 Nov 2021 21:44:33 +0100

On 11/4/21 10:07 PM, Magnus Hagander wrote:
> On Thu, Nov 4, 2021 at 8:47 PM Alvaro Herrera wrote:
> > On 2021-Nov-04, Célestin Matte wrote:
> > > > I don't think this should be the responsibility of pglister. As you
> > > > say, "most MTAs do add this field" -- and the solution is to
> > > > configure the MTA to do this. We already rely on the MTA to get a
> > > > lot of other important things right.
> > >
> > > But then these messages will get delivered by pglister but pgarchives
> > > will fail to archive them, although they do not actually break
> > > requirements. Shouldn't we follow the RFC here?
>
> I agree that the scenario is a problem, per below. I don't agree that
> making up an id is a solution to that problem.
>
> > Maybe pglister should refuse to deliver messages that don't contain
> > a Message-Id.
>
> It should. I actually thought it did already, but apparently it does
> not. I guess we've only ever used it under properly configured MTAs :)
>
> Have you actually come across any case where a *proper* non-spam message
> is sent without a message-id and passes through actual mailservers on
> the way?
>
> Looking through the approximately 1.4 million mails in the postgres list
> archives, not a single one has a message-id generated by the archives
> server MTA (which is configured to generate it).
> Not a single one by our inbound relay servers. And exactly one by the
> pglister server -- which turns out to be a bounce that ended up in the
> archives because of a misconfiguration back in 2018 that's not visible
> in the public archives.

As mentioned down-thread by Justin Clift, we have been outright rejecting
mails without a message-id on the postgresql.org inbound relays since
March 27th 2012(!) according to our repo. The number of rejects due to
that rule is actually not insignificant (approximately 200-400/day, with
the majority coming from a very small number of bounce-generating
senders), but the number of complaints is close to zero.

So the reason pglister rarely sees them is that we don't accept them
upstream, not that they don't exist in the wild... Though the ones in
the wild seem to be "not very useful"...

Stefan
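
For illustration only: the actual rule on the postgresql.org inbound relays is not shown in this thread, but the kind of check being discussed (does a message carry a Message-ID at all?) could be sketched in Python with the standard `email` module. The function name `has_message_id` and the sample messages are hypothetical, not taken from pglister or pgarchives.

```python
from email import message_from_bytes


def has_message_id(raw: bytes) -> bool:
    """Return True if the raw message has a non-empty Message-ID header.

    Hypothetical helper for illustration; header lookup in the email
    package is case-insensitive, so "Message-Id" and "message-id" match.
    """
    msg = message_from_bytes(raw)
    value = msg.get("Message-ID")
    return value is not None and str(value).strip() != ""


# Two made-up messages: one with a Message-ID, one without (e.g. a bounce).
with_id = b"From: a@example.org\r\nMessage-ID: <1@example.org>\r\n\r\nbody\r\n"
without_id = b"From: a@example.org\r\nSubject: bounce\r\n\r\nbody\r\n"

print(has_message_id(with_id))     # True
print(has_message_id(without_id))  # False
```

A list manager or relay could use a check along these lines to reject such mail early instead of generating an id downstream, which is the approach argued for above.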