From: Magnus Hagander
Date: Tue, 15 Jan 2019 11:57:21 +0100
Subject: Re: mailing list redirect for bug numbers?
To: "Jonathan S. Katz"
Cc: Andres Freund, PostgreSQL WWW
In-Reply-To: <9cfc2880-0623-7cc3-45f4-342c3af881fe@postgresql.org>
References: <20190114221809.eymqah36d6uq5nir@alap3.anarazel.de> <9cfc2880-0623-7cc3-45f4-342c3af881fe@postgresql.org>

On Tue, Jan 15, 2019 at 11:43 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:

> Hi Andres,
>
> On 1/14/19 5:18 PM, Andres Freund wrote:
> > Hi,
> >
> > How hard would it be to have a redirect similar to
> > https://www.postgresql.org/message-id/<id>
> >
> > that accepted bug numbers instead of message ids? I don't know the
> > precise database schema of the archives, but I assume it could be done
> > with a prefix query that filters the sender to @postgresql.org, the list
> > to pgsql-bugs, and the prefix to "BUG #<bugno>" or such.
>
> The bug ID numbers are generated from pgweb using:
>
>         SELECT nextval('bug_id_seq')
>
> And then prefixed to the email thread as per the above.
>

Worth noting is that it's also prefixed to the message-id, but that's a
fairly new thing.

> More on this in a second.
>
> > Or perhaps
> > there's a database table with bugs -> messageid mappings somewhere? Or
> > could be created using a query like the above?
>
> IMV that would be an excellent suggestion. My guess is in order to make
> that work, we would create the mapping when the initial bug report makes
> it into the archives.
>
> > It'd be neat to link to bugs from commit messages in a clearer format
> > (i.e. to the bug number, rather than it being one of potentially
> > multiple message ids), and it also makes manual lookup nicer.
>
> Agreed, that sounds like a nicer UX.
>
> The only big catch I see is that if someone emails -bugs directly, no
> number is assigned, so we would have to leave that be.
>

Yeah, those would be entirely out of scope.

> I don't know if we would want to use "/message-id/" as the parent URL,
> just in case someone sent a message with an ID of just digits (for
> whatever reason). Dare I suggest something like "/bugs/<id>/"?
>

Or would we perhaps want to use the postgr.es URL shortener with just a
/b/ (like we have /m/ for messageids)?

> Assuming buy-in, what would need to be done is:
>
> - Adjust the message import script to parse inbound messages with above
> message beginning to -bugs. Determine if it is the first message to the
> thread / bug ID is already registered. If it does not exist, record the
> bug ID, message ID combo in a new table
>

That would suddenly put a very hard-coded assumption about this format
into the archives code, which I think is a bad idea in general, since the
archives code is not specific to the pgsql list usage *at all* today.

> - Write a one time script to map old bug id to first message id in the
> thread.
>
> - Update the urls.py in pgarchives to handle said route and fail
> gracefully if bug ID does not exist
>
> - Note in pgweb where the email is generated that any changes to email
> subject could break things.
>
> And that should be that.
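The import-time mapping proposed in the list above could look roughly like this. It is only a minimal sketch: it assumes bug report subjects start with "BUG #<bugno>", and it uses an in-memory dict in place of the new table. The names `BUG_SUBJECT_RE` and `record_bug_mapping` are made up for illustration and are not part of pgarchives.

```python
import re

# Assumed subject format: the "BUG #<bugno>" prefix mentioned in the thread.
BUG_SUBJECT_RE = re.compile(r"^BUG #(\d+)")


def record_bug_mapping(mapping, list_name, subject, message_id):
    """Record bug id -> first message id. Return the bug id, or None."""
    if list_name != "pgsql-bugs":
        return None
    match = BUG_SUBJECT_RE.match(subject)
    if match is None:
        # Mailed to -bugs directly, or a reply ("Re: BUG #..."): no mapping.
        return None
    bug_id = int(match.group(1))
    # Only the first message seen for a bug id is kept.
    mapping.setdefault(bug_id, message_id)
    return bug_id
```

The same function could be run once over the existing archive, oldest message first, to backfill old bug ids.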

I think it would be easier to just have a simple piece of lookup code that
has access to the archives db. When fed a bug number, it first looks for a
messageid in the new messageid format, and if it's found (and there is
just one), redirects to that. If not, it does a subject prefix search,
validates that the messageid matches what it should (e.g. that it comes
from the correct servers -- there haven't been many so far, so it's easy
to construct a whitelist), and if that returns exactly one match,
redirects to that; otherwise 404.
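As a rough sketch of that lookup order, under some loud assumptions: the two archive queries are passed in as plain callables (`by_new_msgid` and `by_subject_prefix` are hypothetical, since neither the new messageid format nor the archives schema is spelled out here), the subject prefix is assumed to be "BUG #<n>:", and "comes from the correct servers" is approximated by checking the host part of the messageid against a whitelist.

```python
from typing import Callable, Iterable, List, Optional


def host_of(message_id: str) -> str:
    """Return the lowercased part after the last '@', minus a closing '>'."""
    return message_id.rstrip(">").rsplit("@", 1)[-1].lower()


def resolve_bug(
    bug_id: int,
    by_new_msgid: Callable[[int], List[str]],
    by_subject_prefix: Callable[[str], List[str]],
    allowed_hosts: Iterable[str],
) -> Optional[str]:
    """Return the messageid to redirect to, or None (meaning: answer 404)."""
    # Step 1: messageids that already carry the bug number.
    hits = by_new_msgid(bug_id)
    if len(hits) == 1:
        return hits[0]

    # Step 2: fall back to a subject prefix search, keeping only messageids
    # whose host is on the whitelist. The trailing colon is an assumption;
    # without some delimiter "BUG #12" would also match "BUG #123".
    allowed = {host.lower() for host in allowed_hosts}
    candidates = [
        message_id
        for message_id in by_subject_prefix(f"BUG #{bug_id}:")
        if host_of(message_id) in allowed
    ]
    if len(candidates) == 1:
        return candidates[0]
    return None
```

Keeping the queries behind callables means the archives code itself does not need to know about the bug subject format.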

The difficulty in all of those is that we don't currently index the
subject other than as part of the full text. But that can probably be
added pretty cheaply.
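To illustrate why a plain index on the subject is enough for a prefix search, here is a small self-contained sketch using SQLite in memory. The table and column names (`messages`, `messageid`, `subject`) are hypothetical and not the real archives schema. The prefix match is written as a range condition, which an ordinary b-tree index can serve; in PostgreSQL a similar effect for `LIKE 'prefix%'` is usually obtained with a b-tree index using the `text_pattern_ops` operator class.

```python
import sqlite3


def prefix_bounds(prefix):
    """Return (low, high) such that low <= s < high holds exactly for the
    strings s that start with prefix (byte-wise comparison, ASCII prefix)."""
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def find_by_subject_prefix(conn, prefix):
    """Return messageids whose subject starts with prefix."""
    low, high = prefix_bounds(prefix)
    rows = conn.execute(
        "SELECT messageid FROM messages "
        "WHERE subject >= ? AND subject < ? ORDER BY messageid",
        (low, high),
    )
    return [row[0] for row in rows]


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (messageid TEXT, subject TEXT)")
conn.execute("CREATE INDEX messages_subject_idx ON messages (subject)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        ("<m1@example.org>", "BUG #2: first"),
        ("<m2@example.org>", "Re: BUG #2: first"),
        ("<m3@example.org>", "BUG #20: other"),
    ],
)
```

Searching for "BUG #2:" here finds only the original report: the reply does not start with the prefix, and "BUG #20:" is excluded by the colon.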

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/