From: Dave Page <[email protected]>
To: Christoph Berg <[email protected]>
To: Dave Page <[email protected]>
To: PostgreSQL WWW <[email protected]>
To: Adrian Vondendriesch <[email protected]>
Subject: Re: apt.postgresql.org django app for www.postgresql.org
Date: Fri, 10 Jul 2020 13:50:08 +0100
Message-ID: <CA+OCxoxK1KR7cyn_wn-VwVfYOvpUXhooNC0VAuOysPp2uoM8=Q@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <CA+OCxoz7FvOQh19RtX0Cykv0ScbeYD4RLTPitmAWR-9y2cVxWw@mail.gmail.com>
<CA+OCxowkHqCxk5A4e1VRYOLzj7s+_Pm47WZ2=qpbUz7A2BzO2A@mail.gmail.com>
<[email protected]>
On Thu, Jul 9, 2020 at 10:31 AM Christoph Berg <[email protected]> wrote:
> Re: Dave Page
> > I spent a bunch of time playing with this, as I intend to get repo
> > browsing for both Yum and Apt onto the website.
> >
> > There was quite a bit of work to do to get it working with modern versions
> > of Django and Python 3. Once I got through enough of that to start looking
> > at the actual functionality what I found was *really* comprehensive.
> > Unfortunately I think there's actually far more there than we should put on
> > the main website.
>
> Hi Dave,
>
> thanks for picking this up.
>
> It's well possible that I overshot the goal when I picked this up from
> Adrian and put more and more info into it.
>
:-)
>
> > - I think the QA section is clearly something that's aimed at you as
> > maintainers of the apt repos. This definitely doesn't belong on the main
> > website in my opinion.
>
> Yeah that could go.
>
> > - The madison interface is also interesting (academically), but I think is
> > of little use to the vast majority of our users; I'm not even sure that the
> > majority of Debian/Ubuntu users would know about rmadison.
>
> I'm using that daily on the Debian archive, and it would help me a lot
> if it were there. But we don't have to link it from every page, it's
> just some API endpoint, we don't have to confuse users by linking to
> it.
>
> > - Similarly, I think the binary and source package pages are far more
> > comprehensive than most of our users need or would care about.
> >
> > One of the biggest barriers of adoption to PostgreSQL is the perceived
> > complexity, including that of getting it up and running. That's why I'm
> > spending a lot of time at the moment trying to simplify and clarify the
> > download and installation processes. I think what we have in this patch
> > will simply be information overload for most of our users.
>
> Ack. We can probably merge the source and binary views into a single
> (bigger) page with less clutter that users would reach by default. We
> can still have the detailed pages linked from that for users that need
> to know the details. I'm sure we can find a way to do that that
> doesn't spoil the complexity reduction idea.
>
Yeah. As a side note, the other thing I'm trying to do is be fairly
consistent design-wise between apt, yum and zypp.
>
> > My suggestion is that we incorporate a relatively simple browser into the
> > main website, which allows users to easily browse the available packages
> > and see the details of them.
>
> That'd be about what I said above, I think.
>
> > I already have the repo scanning part of that
> > done for both apt and yum, generating JSON output in a way that can be
> > integrated with our download server sync process, which can load that into
> > the website database.
>
> Is that online somewhere?
>
No, but I've attached the WIP scripts. Comments welcome. Note that I've
studiously avoided using any modules or external utilities that are only
available on Debian/Ubuntu. Sample output for Apt looks like:
[
  {
    "Architecture": "arm64",
    "Build": "3.pgdg+1",
    "Description": "debug symbols for pg-rage-terminator-9.6",
    "Distribution": "sid",
    "Filename": "pool/main/p/pg-rage-terminator/pg-rage-terminator-9.6-dbgsym_0.1.7-3.pgdg+1_arm64.deb",
    "Licence": "",
    "Maintainer": "Adrian Vondendriesch <[email protected]>",
    "Package": "pg-rage-terminator-9.6-dbgsym",
    "Repo": "sid-pgdg-testing",
    "Version": "0.1.7"
  },
  {
    "Architecture": "arm64",
    "Build": "1.pgdg+1",
    "Description": "PostgreSQL management tool - GUI application\npgAdmin is an open source administration and management tool for the\nPostgreSQL database. It includes a graphical administration interface, an SQL\nquery tool, a procedural code debugger and much more. The tool is designed to\nanswer the needs of developers, DBAs and system administrators alike.\n\nThis package installs the GUI application.",
    "Distribution": "sid",
    "Filename": "pool/main/p/pgadmin4/pgadmin4_4.21-1.pgdg+1_arm64.deb",
    "Licence": "",
    "Maintainer": "Debian PostgreSQL Maintainers <[email protected]>",
    "Package": "pgadmin4",
    "Repo": "sid-pgdg-testing",
    "Url": "https://www.pgadmin.org/",
    "Version": "4.21"
  }
]
And for Yum:
[
  {
    "Architecture": "ppc64le",
    "Build": "9.rhel7",
    "Description": "pgAgent is a job scheduler for PostgreSQL which may be managed\nusing pgAdmin.",
    "Distribution": "rhel-7",
    "Filename": "9.5/redhat/rhel-7-ppc64le/pgagent_95-3.4.0-9.rhel7.ppc64le.rpm",
    "Licence": "PostgreSQL",
    "Maintainer": "",
    "Package": "pgagent_95",
    "Repo": "9.5",
    "Url": "http://www.pgadmin.org/",
    "Version": "3.4.0"
  },
  {
    "Architecture": "ppc64le",
    "Build": "1.f25",
    "Description": "The PostgreSQL Audit extension (pgaudit) provides detailed session\nand/or object audit logging via the standard PostgreSQL logging\nfacility.\n\nThe goal of the PostgreSQL Audit extension (pgaudit) is to provide\nPostgreSQL users with capability to produce audit logs often required to\ncomply with government, financial, or ISO certifications.\n\nAn audit is an official inspection of an individual's or organization's\naccounts, typically by an independent body. The information gathered by\nthe PostgreSQL Audit extension (pgaudit) is properly called an audit\ntrail or audit log. The term audit log is used in this documentation.",
    "Distribution": "rhel-7",
    "Filename": "9.5/redhat/rhel-7-ppc64le/pgaudit10_95-1.0.5-1.f25.ppc64le.rpm",
    "Licence": "BSD",
    "Maintainer": "",
    "Package": "pgaudit10_95",
    "Repo": "9.5",
    "Url": "https://www.pgaudit.org",
    "Version": "1.0.5"
  }
]
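Since the Apt and Yum samples share the same field names, a consumer could handle both catalogs with the same code. The following is only a minimal sketch of that idea and is not one of the attached scripts: REQUIRED_FIELDS, missing_fields() and group_by_target() are hypothetical names, and treating "Url" as optional is an assumption based on the first Apt sample, which has no such key.

```python
import json
from collections import defaultdict

# Keys present in every sample entry above. "Url" is left out because the
# first Apt sample does not carry it (assumption: it is optional).
REQUIRED_FIELDS = {"Architecture", "Build", "Description", "Distribution",
                   "Filename", "Licence", "Maintainer", "Package", "Repo",
                   "Version"}


def missing_fields(entry):
    """Return the required keys that a catalog entry lacks, sorted."""
    return sorted(REQUIRED_FIELDS - set(entry))


def group_by_target(entries):
    """Map (Distribution, Architecture) to the package names built for it."""
    groups = defaultdict(list)
    for entry in entries:
        key = (entry["Distribution"], entry["Architecture"])
        groups[key].append(entry["Package"])
    return dict(groups)


if __name__ == "__main__":
    # Trimmed-down entries based on the samples above.
    sample = json.loads("""
    [
      {"Architecture": "arm64", "Distribution": "sid",
       "Package": "pgadmin4"},
      {"Architecture": "ppc64le", "Distribution": "rhel-7",
       "Package": "pgagent_95"}
    ]
    """)
    print(group_by_target(sample))
    print(missing_fields(sample[0]))
```

Running missing_fields() over a freshly generated apt.json or yum.json would be one cheap way to spot entries that the scanners only partially decoded.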
>
> > I would support a separate site (probably under apt.enterprisedb.com) that
> > supports the level of functionality you have in your patch; and I think
> > much, if not all of the code you currently have could be used for that.
> > This could of course be linked from the main website.
>
> Maintaining two sets of interfaces is probably too much. I think we
> can get the "main" one to work, we just need to remove lots of
> clutter.
>
I have a suspicion that what I think would be appropriate for most of our
users wouldn't be enough for you. Definitely open to discussion though!
>
> > Obviously that should be apt.postgresql.org :-)
>
> That got me for a second. ;)
Yeah, muscle memory :-p
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
EDB: http://www.enterprisedb.com
Attachments:
[text/x-python-script] scan-apt.py (4.4K, 3-scan-apt.py)
import argparse
import json
import os
from pathlib import Path


def get_package_files(repo_dir):
    # Get a list of all the Packages files
    package_files = []
    for path in Path(repo_dir).rglob('Packages'):
        package_files.append(path.absolute())

    return package_files


def get_distribution(package_file):
    # Find the Release file corresponding to the Packages file, and get the
    # distribution name from it.
    with open(os.path.dirname(package_file) + '/Release', "r") as release_data:
        for release_line in release_data:
            if release_line.startswith("Archive: "):
                # Strip the trailing newline from the value
                return release_line[9:].strip()

    return None


def get_licence(deb_file):
    # TODO: Figure out a sane way to get the license
    licence = ''

    return licence


def get_packages(package_file):
    # Extract all the packages from a package file
    packages = []

    with open(package_file, "r") as package_data:
        package = ""
        first = True
        for line in package_data:
            if line.startswith('Package: ') and not first:
                # Get the distribution
                distribution = get_distribution(package_file)
                if distribution is not None:
                    package = package + '\nDistribution: ' + distribution

                packages.append(package)
                package = ""
            else:
                first = False

            package = package + line

        # The final entry has no following "Package: " line to trigger the
        # append above, so add it here.
        if package.strip():
            distribution = get_distribution(package_file)
            if distribution is not None:
                package = package + '\nDistribution: ' + distribution

            packages.append(package)

    return packages


def get_package(package_data):
    # Decode a Package entry into a dictionary
    package = {}
    in_description = False

    for line in package_data.splitlines():
        # Package
        if line.startswith("Package: "):
            package['Package'] = line[9:]

        # Version
        if line.startswith("Version: "):
            # The build is normally prefixed with a -, but sometimes
            # just .pgdg
            package['Version'] = line[9:].split('-')[0]
            package['Build'] = ''.join(line[9:].split('-')[1:])

            if '.pgdg' in package['Version']:
                version = line[9:].split('.pgdg')
                package['Version'] = version[0]
                package['Build'] = 'pgdg' + version[1]

        # Architecture
        if line.startswith("Architecture: "):
            package['Architecture'] = line[14:]

        # Filename
        if line.startswith("Filename: "):
            package['Filename'] = line[10:]

            # Licence
            licence = get_licence(line[10:])
            if licence is not None:
                package['Licence'] = licence

        # Description. This can be multi-line. Treat the first line
        # as normal, then scan the rest until we hit the end
        if in_description:
            if line.strip() == ".":
                package['Description'] = package['Description'] + "\n"
            # The description ends when we encounter a line that doesn't start
            # with a space.
            elif not line.startswith(" "):
                in_description = False
            else:
                package['Description'] = \
                    package['Description'] + '\n' + line.strip()

        if line.startswith("Description: "):
            package['Description'] = line[13:]
            in_description = True

        # Distribution/Repo
        if line.startswith("Distribution: "):
            package['Distribution'] = line[14:].split('-')[0]
            package['Repo'] = line[14:]

        # URL
        if line.startswith("Homepage: "):
            package['Url'] = line[10:]

        # Packager
        if line.startswith("Maintainer: "):
            package['Maintainer'] = line[12:]

    return package


def main():
    # Command line arguments
    parser = argparse.ArgumentParser(description='Scan a set of APT repos and '
                                                 'generate a JSON catalog of '
                                                 'the contents.')
    parser.add_argument("repo", help="the repo directory, or directory "
                                     "containing multiple repos")
    args = parser.parse_args()

    package_info = []
    package_files = get_package_files(args.repo)
    for package_file in package_files:
        packages = get_packages(package_file)
        for package in packages:
            package_info.append(get_package(package))

    with open('apt.json', 'w') as output_file:
        json.dump(package_info, output_file, indent=2, sort_keys=True)


if __name__ == "__main__":
    main()
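
The Version handling in get_package() above can be tried in isolation. The sketch below copies that logic into a standalone function purely for illustration: split_version() is a hypothetical name that does not exist in scan-apt.py, and the input "1.2.pgdg+1" is a made-up string used to exercise the branch for versions without a dash.

```python
def split_version(raw):
    """Split a Debian version string into (version, build).

    Mirrors the logic in get_package(): everything before the first '-'
    is the version and the remaining pieces are joined into the build.
    If no dash separates them, the split happens at '.pgdg' instead.
    Epoch prefixes ("N:") are not treated specially here.
    """
    parts = raw.split('-')
    version = parts[0]
    build = ''.join(parts[1:])

    if '.pgdg' in version:
        pieces = raw.split('.pgdg')
        version = pieces[0]
        build = 'pgdg' + pieces[1]

    return version, build


if __name__ == "__main__":
    # The first two strings come from the sample Filename values above;
    # the third is a made-up example of the dash-less form.
    for raw in ("0.1.7-3.pgdg+1", "4.21-1.pgdg+1", "1.2.pgdg+1"):
        print(raw, "->", split_version(raw))
```

Keeping the split in a small pure function like this would make it straightforward to add test cases as more unusual version strings turn up in the repos.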
[text/x-sh] sync-repos.sh (483B, 4-sync-repos.sh)
#!/bin/sh
echo "Syncing APT repos..."
rsync -avz --delete --include "*/" --include="Release" --include="Packages" --exclude="*" rsync://ftp.postgresql.org/pgsql-ftp/repos/apt .
echo "Syncing Yum repos..."
rsync -avz --delete --include "*/" --include="*/repodata/*" --exclude="*" rsync://ftp.postgresql.org/pgsql-ftp/repos/yum .
echo "Syncing Zypp repos..."
rsync -avz --delete --include "*/" --include="*/repodata/*" --exclude="*" rsync://ftp.postgresql.org/pgsql-ftp/repos/zypp .
[text/x-python-script] scan-yum.py (2.1K, 5-scan-yum.py)
import argparse
import json
import os
from pathlib import Path

import repomd


def get_repos(repo_dir):
    # Get a list of all repo dirs
    repos = []
    for path in Path(repo_dir).rglob('repomd.xml'):
        repos.append(path.parent.parent.absolute())

    return repos


def get_distribution(repo):
    # The distribution is the first two dash-separated parts of the repo
    # directory name, e.g. rhel-7-ppc64le -> rhel-7
    distribution_dir = repo.name
    parts = distribution_dir.split('-')
    distribution = '-'.join(parts[:2])

    return distribution


def get_package_info(repo, base_path):
    packages = []
    repo_data = repomd.load('file://' + str(repo))
    for package_data in repo_data:
        package = {'Package': package_data.name,
                   'Version': package_data.version,
                   'Build': package_data.release,
                   'Architecture': package_data.arch,
                   'Filename': os.path.relpath(str(repo) + '/' +
                                               package_data.location,
                                               base_path),
                   'Description': package_data.description,
                   'Distribution': get_distribution(repo),
                   'Url': package_data.url,
                   'Maintainer': package_data.vendor,
                   'Repo': os.path.relpath(str(repo),
                                           base_path).split('/')[0],
                   'Licence': package_data.license
                   }
        packages.append(package)

    return packages


def main():
    # Command line arguments
    parser = argparse.ArgumentParser(description='Scan a set of Yum repos and '
                                                 'generate a JSON catalog of '
                                                 'the contents.')
    parser.add_argument("repo", help="the repo directory, or directory "
                                     "containing multiple repos")
    args = parser.parse_args()

    package_info = []
    repos = get_repos(args.repo)
    for repo in repos:
        package_info.extend(get_package_info(repo, args.repo))

    with open('yum.json', 'w') as output_file:
        json.dump(package_info, output_file, indent=2, sort_keys=True)


if __name__ == "__main__":
    main()
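
The message mentions loading the generated JSON into the website database, but that step is not among the attachments. As a rough sketch of what such a loader might look like, the example below uses sqlite3 purely as a stand-in. The real website schema is not shown in this thread, so the table name "packages", the "source" column and load_catalog() are all hypothetical.

```python
import json
import sqlite3

# Column names follow the keys emitted by scan-apt.py and scan-yum.py.
COLUMNS = ("Package", "Version", "Build", "Architecture", "Distribution",
           "Repo", "Filename", "Licence", "Maintainer", "Url", "Description")


def load_catalog(conn, source, entries):
    """Replace all rows for one source ('apt' or 'yum') with new entries.

    Returns the number of rows inserted. Missing or null fields are
    stored as empty strings.
    """
    column_defs = ", ".join('"%s" TEXT' % column for column in COLUMNS)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages (source TEXT, %s)" % column_defs)

    # Drop the previous snapshot for this source so a re-sync does not
    # leave stale packages behind.
    conn.execute("DELETE FROM packages WHERE source = ?", (source,))

    rows = [(source, *(entry.get(column) or "" for column in COLUMNS))
            for entry in entries]
    placeholders = ", ".join("?" * (len(COLUMNS) + 1))
    conn.executemany(
        "INSERT INTO packages VALUES (%s)" % placeholders, rows)
    conn.commit()

    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    # In practice the entries would come from json.load() on apt.json or
    # yum.json; a single trimmed entry is inlined here instead.
    entries = json.loads('[{"Package": "pgagent_95", "Version": "3.4.0"}]')
    print(load_catalog(connection, "yum", entries))
```

Deleting and reinserting per source keeps the loader idempotent, which seems a reasonable fit for a periodic sync job, though an incremental update would be another option.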