Received: from malur.postgresql.org ([217.196.149.56])
        by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256)
        (Exim 4.92)
        (envelope-from )
        id 1mjP5P-0004Ie-FJ
        for pgsql-www@arkaria.postgresql.org; Sat, 06 Nov 2021 17:02:35 +0000
Received: from localhost ([127.0.0.1] helo=malur.postgresql.org)
        by malur.postgresql.org with esmtp (Exim 4.92)
        (envelope-from )
        id 1mjP5O-00064x-0Q
        for pgsql-www@arkaria.postgresql.org; Sat, 06 Nov 2021 17:02:34 +0000
Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29])
        by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256)
        (Exim 4.92)
        (envelope-from )
        id 1mjP5N-00064o-PJ
        for pgsql-www@lists.postgresql.org; Sat, 06 Nov 2021 17:02:33 +0000
Received: from mail-lf1-x12e.google.com ([2a00:1450:4864:20::12e])
        by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128)
        (Exim 4.92)
        (envelope-from )
        id 1mjP5M-0003E7-0V
        for pgsql-www@lists.postgresql.org; Sat, 06 Nov 2021 17:02:33 +0000
Received: by mail-lf1-x12e.google.com with SMTP id bu18so25671730lfb.0
        for ; Sat, 06 Nov 2021 10:02:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=hagander-net.20210112.gappssmtp.com; s=20210112;
        h=mime-version:references:in-reply-to:from:date:message-id:subject:to
         :cc;
        bh=Ne9lEE2x4HlId/tL39uaN9wbpImxMNZg+L60i4QLSPM=;
        b=upP+2CtWVRKKjcwLh6tsI8ooS7syV/xO9TvJPNr6ugf9bLkwKMsq1m++ghgi5OEc21
         IDTI3LOAWU78Q8sMhvQDY0okFDLBkaHqnMAIPxwne/Lzp7A1zpx8PuBNfDdHt7ajXDOV
         FQi/NI/3VIOl8dR4iuKntMgDDFmUnMUADYGwKNIb7UpoXN6di9t6dx1jjVNeC8fGVZsK
         XTwI5xkJ+vyL4hPsIgRaRCExWXTp9WQ94+Rnx3//9DszwaGZG5zRfw8zUjujtAVi1lQ5
         +00ScF4qS0nixFeb6wrZA80CDctZgoQfD6wkEDQAkpSwypgzYrypzlkJvFiNf/KhI0xd
         oVaA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=x-gm-message-state:mime-version:references:in-reply-to:from:date
         :message-id:subject:to:cc;
        bh=Ne9lEE2x4HlId/tL39uaN9wbpImxMNZg+L60i4QLSPM=;
        b=TjDNhBunZEDeLa8XM9j5MUnvDsq8OQEYkAs/62hNFZYHyq2bkFjj+XvvR8AkaevypM
         wKu4nRnR32GgbNLq2xZ/VRe/cGCsi5pqiEM7rlQXtlOIh6eFydW0SmNpc8ke5qJyuBz1
         vry16MHEaUz7TqFkcFkdly5Tq1dfkTbBOwTT/zrSJaNJTM8JFPjafuAal7AXIVkwJFT2
         Ho4/pNT3wDWKejXFcGphb3Vsoq10EOQbgPgtd0OU8xwdsCQpzZDD7lsWl/S0hnvFw/rF
         NY3zjxRF/ETMqjh3QQY54YoCQgwh0aEtq8qKRrFiID2PgldrTnfojyeNKEb1LKb6g+5k
         e/BQ==
X-Gm-Message-State: AOAM532FDsUOoUB9iKFbrr+U4Eun9zB/7MJKZ6hntOJxqkWIuNIOo7Sp
        ViPYnxZ9Sr7xdaiTCA0iNEtedjG8Vyqt36ii+Xs8616u6I0=
X-Google-Smtp-Source: ABdhPJzba07v6KLoAEbduXmS4Mg7+bQoyT+bDmOr+utiJh8uwaQXk3xjtRi1PzU0kewf92cVLj4B9SR9CDYgsf4nnds=
X-Received: by 2002:a05:6512:249:: with SMTP id b9mr983617lfo.496.1636218151016;
        Sat, 06 Nov 2021 10:02:31 -0700 (PDT)
MIME-Version: 1.0
References: <53316703-db69-7067-c82e-47a598711595@cmatte.me>
        <202111041947.x5su2sdjdx74@alvherre.pgsql>
        <61fb4078-6b27-9921-7f7d-0345978d8300@kaltenbrunner.cc>
In-Reply-To: <61fb4078-6b27-9921-7f7d-0345978d8300@kaltenbrunner.cc>
From: Magnus Hagander
Date: Sat, 6 Nov 2021 18:02:09 +0100
Message-ID:
Subject: Re: [PATCH] pgarchives: parser: handle messages in which Message-ID is missing
To: Stefan Kaltenbrunner
Cc: Alvaro Herrera, =?UTF-8?Q?C=C3=A9lestin_Matte?=, pgsql-www@lists.postgresql.org
Content-Type: multipart/alternative; boundary="000000000000aaaeb605d021b99a"
List-Id:
List-Help:
List-Subscribe:
List-Post:
List-Owner:
List-Archive:
Archived-At:
Precedence: bulk

--000000000000aaaeb605d021b99a
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Fri, Nov 5, 2021 at 9:44 PM Stefan Kaltenbrunner wrote:

> On 11/4/21 10:07 PM, Magnus Hagander wrote:
> >
> >
> > On Thu, Nov 4, 2021 at 8:47 PM Alvaro Herrera wrote:
> >
> >     On 2021-Nov-04, Célestin Matte wrote:
> >
> >      > > I don't think this should be the responsibility of pglister. As you
> >      > > say, "most MTAs do add this field" -- and the solution is to
> >      > > configure the MTA to do this. We already rely on the MTA to get a
> >      > > lot of other important things right.
> >      >
> >      > But then these messages will get delivered by pglister but pgarchives
> >      > will fail to archive them, although they do not actually break
> >      > requirements. Shouldn't we follow the RFC here?
> >
> >
> > I agree that the scenario is a problem, per below. I don't agree that
> > making up an id is a solution to that problem.
> >
> >
> >     Maybe pglister should refuse to deliver messages that don't contain
> >     a Message-Id.
> >
> >
> > It should. I actually thought it did already, but apparently it does
> > not. I guess we've only ever used it under properly configured MTAs :)
> >
> > Have you actually come across any case where a *proper* non-spam message
> > is sent without a message-id and passes through actual mailservers on
> > the way?
> >
> > Looking through the approximately 1.4 million mails in the postgres list
> > archives, not a single one has a message-id generated by the archives
> > server MTA (which is configured to generate it). Not a single one by our
> > inbound relay servers. And exactly one by the pglister server -- which
> > turns out to be a bounce that ended up in the archives because of a
> > misconfiguration back in 2018 that's not visible in the public archives.
>
> as mentioned down-thread by Justin Clift we have been plain rejecting
> mails without a message-id on the postgresql.org inbound relays since
> March 27th 2012(!) according to our repo and the number of rejects due
> to that rule is actually not-insignificant (approximately 200-400/day
> with the majority being for a very small number of bounce generating
> senders) but the number of complaints is also approaching (almost) zero.
>

Oh I forgot about that one. That clearly explains why I didn't find
anything *after* 2012 in the archives -- but in my defence, we don't have
any from before that either :)

I thought we actually manufactured a message-id on them rather than reject
them, but I guess I was confusing it with how we handle email injected
internally where I think we do that.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

--000000000000aaaeb605d021b99a
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Fri, Nov 5, 2021 at 9:44 PM Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
On 11/4/21 10:07 PM, Magnus Hagander wrote:
>
>
> On Thu, Nov 4, 2021 at 8:47 PM Alvaro Herrera <alvherre@alvh.no-ip.org
> <mailto:alvherre@alvh.no-ip.org>> wrote:
>
>     On 2021-Nov-04, Célestin Matte wrote:
>
>      > > I don't think this should be the responsibility of pglister. As you
>      > > say, "most MTAs do add this field" -- and the solution is to
>      > > configure the MTA to do this. We already rely on the MTA to get a
>      > > lot of other important things right.
>      >
>      > But then these messages will get delivered by pglister but pgarchives
>      > will fail to archive them, although they do not actually break
>      > requirements. Shouldn't we follow the RFC here?
>
>
> I agree that the scenario is a problem, per below. I don't agree that
> making up an id is a solution to that problem.
>
>
>     Maybe pglister should refuse to deliver messages that don't contain
>     a Message-Id.
>
>
> It should. I actually thought it did already, but apparently it does
> not. I guess we've only ever used it under properly configured MTAs :)
>
> Have you actually come across any case where a *proper* non-spam message
> is sent without a message-id and passes through actual mailservers on
> the way?
>
> Looking through the approximately 1.4 million mails in the postgres list
> archives, not a single one has a message-id generated by the archives
> server MTA (which is configured to generate it). Not a single one by our
> inbound relay servers. And exactly one by the pglister server -- which
> turns out to be a bounce that ended up in the archives because of a
> misconfiguration back in 2018 that's not visible in the public archives.

as mentioned down-thread by Justin Clift we have been plain rejecting
mails without a message-id on the postgresql.org inbound relays since
March 27th 2012(!) according to our repo and the number of rejects due
to that rule is actually not-insignificant (approximately 200-400/day
with the majority being for a very small number of bounce generating
senders) but the number of complaints is also approaching (almost) zero.

Oh I forgot about that one. That clearly explains why I didn't find anything *after* 2012 in the archives -- but in my defence, we don't have any from before that either :)

I thought we actually manufactured a message-id on them rather than reject them, but I guess I was confusing it with how we handle email injected internally where I think we do that.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
--000000000000aaaeb605d021b99a--
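
The two policies discussed in this thread (reject inbound mail that has no Message-ID, but manufacture one for mail injected internally) can be sketched in a few lines of Python. This is only an illustration built on the standard `email` package. It is not the postgresql.org relay configuration and not pglister or pgarchives code; the function names and the `lists.example.org` domain are made up for the example.

```python
from email import message_from_string
from email.message import Message
from email.utils import make_msgid


def has_message_id(msg: Message) -> bool:
    """Return True when the message carries a non-empty Message-ID header."""
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    return value is not None and value.strip() != ""


def inbound_verdict(raw: str) -> str:
    """Hypothetical inbound-relay policy: reject mail lacking a Message-ID."""
    return "accept" if has_message_id(message_from_string(raw)) else "reject"


def inject_internal(raw: str, domain: str = "lists.example.org") -> Message:
    """Hypothetical internal-injection policy: add a Message-ID if absent.

    The domain default is a placeholder, not a real list host.
    """
    msg = message_from_string(raw)
    if not has_message_id(msg):
        # Drop an empty header first (a no-op when the header is absent),
        # so the message never ends up with two Message-ID headers.
        del msg["Message-ID"]
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg
```

Under these assumptions, `inbound_verdict` would sit at the edge where mail from outside arrives, while `inject_internal` covers only mail the system generates itself, so an archiver downstream can rely on every stored message having an id without inventing ids for third-party mail.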