DDX for PostgreSQL — Community

OVERVIEW

Every PostgreSQL contribution we can extract from public sources, attributed to a person, indexed by release. contributors so far across 18 authoritative git repos (postgres core plus JDBC/ODBC/psycopg/pgpool/pgdog and extensions), the mailing-list archives, and the GitHub issue / pull-request feeds for pgjdbc and psqlodbc (those two projects don't use mailing lists; their issue trackers are the primary record). The PG core toggle below restricts to the canonical postgres repo only.

Disclaimer. This is one estimation among many. It is not meant to shame, promote, rank, or compare anyone. Contribution is not measurable; what we count here are observable interactions in public archives, weighted equally regardless of size, impact, or difficulty. A 12-line bug fix counts the same as 6,000-line architectural rewrite. A drive-by typo correction counts the same as a multi-year rebase campaign. The numbers below should be read as a rough activity signal across the ecosystem, nothing more. People appear here because their work shows up in public archives; people not here are not less valuable contributors.

The employer column is best-effort: it uses a manual override where one is set, falling back to the email-domain mapping for everyone else, so many personal-email contributors land in Other. See the org rollup page for the per-employer view.

Sources, methodology, the data-flow diagram, and identity coalescing details are further down the page, after the contributor list.

CONTRIBUTORS

Scope
Time
Branch
people
total contributions
median per person
average
stddev

TOP 60 CONTRIBUTORS

Loading…

ALL CONTRIBUTORS

Every person who's authored a commit, reviewed a patch, reported an issue, or contributed in any other tracked role. Click any name to jump to their per-person page.

Loading…

SOURCES

The contributor list above pulls from six public data sources, deduplicated by stable identifiers and cross-linked into a single per-person view.

DATA FLOW

The pipeline from upstream sources to the per-person card on this page:

SOURCES INGEST IDENTITY AGGREGATE RENDER git mirrors postgresql.git + 17 more mailing lists ~385k messages, 50+ lists github issues + PRs pgjdbc, psqlodbc buildfarm animal owners tier scrape postgresql.org/community committer scrape postgresql.org/developer operator overrides community-identities.yaml agora indexer commits → ag_commits trailers → contributions_raw messages → ag_messages scrapers community.tiers community.committers bot filter community.is_bot() 4,700+ patterns YAML loader → person_aliases coalescing community.persons + community.person_aliases canonical_email + slug org inference community.org_for_person() override → domain → Other role rollup mv_person_contribution _summary refresh every 6h org rollup mv_org_contribution _summary /community cards + per-person /<hash>/ /community /orgs/ + chart

Each box is a system component; arrows mark data dependence (left side computed first). The aggregate boxes are REFRESH MATERIALIZED VIEW CONCURRENTLY'd every six hours on a postgres cron schedule (HH:17), then the static rendering job fires eight minutes later (HH:25) to pre-build per-person card pages.

HOW THE NUMBERS ARE COMPUTED

For each person p, we count their interactions across thirteen role categories: r ∈ { authored, committed, reviewed_by, tested_by, reported_by, coauthored_by, suggested_by, signed_off, discussion, backpatch, wiki_edit, email_from, email_to }. Let bb be a time bucket (a release like PG18, a year like 2026, or all), s{core,ecosystem}s \in \{\text{core}, \text{ecosystem}\}, and m{true,false}m \in \{\text{true}, \text{false}\} a flag for restricting to master-branch commits only.

The per-(person, bucket, scope, master-only) total is then:

Total(p,b,s,m)  =  rC(p,r,b,s,m)\mathrm{Total}(p, b, s, m) \;=\; \sum_{r} C(p, r, b, s, m)

where C(p,r,b,s,m)C(p, r, b, s, m) is the number of distinct interactions of role rr for person pp that match the given bucket / scope / master filter. Each event (commit, review trailer, mailing-list message, GitHub comment) is counted at most once per role, deduplicated by its stable identifier (commit_sha for git, mid for mail and GitHub).

Identity coalescing groups multiple aliases under one canonical person:

hash(p)  =  sha256(canonical_email(p))[1 ⁣ ⁣: ⁣ ⁣8]\mathrm{hash}(p) \;=\; \mathrm{sha256}\bigl(\mathrm{canonical\_email}(p)\bigr)[1\!\!:\!\!8]

Each person has a single canonical email; every git Author:, mailing-list From:, and GitHub handle that's been observed for them maps to that canonical email via community.person_aliases. The 8-character SHA-256 prefix is the URL slug at /community/<hash>/; it's stable across re-syncs and avoids leaking email plaintext.

The org rollup at /community/orgs/ is the same total grouped by employer:

Org_total(o,b,s,m)  =  p:Org(p)=oTotal(p,b,s,m)\mathrm{Org\_total}(o, b, s, m) \;=\; \sum_{p\,:\,\mathrm{Org}(p) = o} \mathrm{Total}(p, b, s, m)

where Org(p)\mathrm{Org}(p) is the operator-curated employer override for pp if set, otherwise the inferred organisation from pp's email domain via community.org_for_person().

IDENTITY COALESCING

One person can show up in our data as many email addresses — a personal account, a current employer, a past employer, a GitHub noreply+<handle>@github.com bouncer, an @oldco.example address that became @newco.example five years ago. Without coalescing, every alias is its own row; totals are scattered, the per-person card under-reports.

The fix is a manual mapping in data/community-identities.yaml (also served at /community/identities.yaml for direct download / audit). Each entry pairs one canonical email with a list of aliases:

- canonical: [email protected]
  display_name: Alex Rivera
  aliases:
    - [email protected]
    - [email protected]
- canonical: [email protected]
  display_name: Sam Okafor
  aliases:
    - [email protected]
    - [email protected]

The list is operator-curated, intentionally conservative, and auditable. Anyone can propose additions via the community-correction template. On each per-person page (/community/<hash>/), the "Aliases" section lists every email + display-name pair the mapping has folded into that one canonical record — so disagreement is visible, not hidden.

The 8-character slug at /community/<hash>/ is sha256 of the canonical email, prefix-8. Stable across re-syncs; doesn't leak the email plaintext; the same person always gets the same URL no matter which alias we encountered first.

What we do not do: heuristic clustering on display name, email-prefix similarity, GitHub-handle guessing, or any automatic multi-source merging without a human in the loop. The cost of a false-positive merge (two distinct people collapsed into one) is much higher than the cost of a false-negative split (one person appearing as two rows until somebody files an issue), so we deliberately err on the side of leaving identities split.

COLUMN MEANINGS

Counts come from community.mv_person_contribution_summary, a materialized view rebuilt every six hours. Per the algorithm above, each role r corresponds to one of the columns in the per-person card:

The master-only toggle restricts the underlying contribution rows to those whose first-seen branch is master/main/HEAD (excluding backported commits on stable branches). It's only meaningful for the PG core scope; ecosystem repos use varying branch conventions.

KNOWN GAPS

Raw data and methodology in /community/data.sql.

FIX YOUR ENTRY

If we got your name, emails, employer, committer status, or tier wrong — or if you're appearing as two separate /community/<hash>/ rows that should be merged — file an issue against ddx/site on Codeberg using the community-correction template. The template asks for everything we'd want to know:

Edits land in data/community-identities.yaml in the infra repo and propagate on the next nightly identity sync (usually within 24 hours).

COMMUNITY RESOURCES

Other hosted tools and resources in the broader PostgreSQL community ecosystem.

commitfest.postgresql.org Official commitfest patch tracker — submit, review, and track patches through the release cycle.
cfbot.cputube.org Commitfest CI bot — automated patch application and build testing for open commitfest entries.
buildfarm.postgresql.org PostgreSQL build farm — continuous build and test across dozens of OS/arch/compiler combinations. Animal ownership shown on per-person pages here.
coverage.postgresql.org PostgreSQL code coverage reports — line and branch coverage for the postgres source tree.
planet.postgresql.org Planet PostgreSQL — community blog aggregator.
wiki.postgresql.org PostgreSQL wiki — community documentation, HowTos, TODO lists, and developer notes.
postgresql.org/list Official mailing list archives — pgsql-hackers, pgsql-general, pgsql-bugs, and 40+ others. Also mirrored on this site.