GitHub has the opportunity to streamline and secure the package management layer. Here's how.
GitHub is the system of record for code, yet the company rarely takes advantage of this. GitLab, on the other hand, has used the same position to build products that span the entire software development lifecycle. GitHub's distinctive strength is its sheer number of public projects – projects that end users consume mostly through package managers.
How does it work today? When a developer updates a package, they follow roughly these steps:
- Make some code changes and push to GitHub
- Tag that revision in git (e.g., v1.0.1) on GitHub
- Publish a release on GitHub
- Use that same tag and bundle the code into a zip file
- Publish to a package manager (e.g. npm for JavaScript/TypeScript)
Not only does the package manager hold three redundant pieces of information (code, version, and package name), but there's also no guarantee that they correspond to the open-source code on GitHub. Here's a quick list of a few things that go wrong in this process.
- Squatters sit on a popular name, so a project needs to publish its packages under a slightly different name.
- Malicious code is uploaded to the package manager that does not match what's on GitHub
- The package is maintained by someone else, not the author of the code
- GitHub is updated, but the author hasn't published the release to a package manager yet, so users can't use it
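The mismatch problem in the list above is checkable in principle: hash the files in the published package and the files at the git tag, then compare. Here is a minimal sketch of that idea. The function names `tree_digest` and `matches_source` are hypothetical, and the files are passed in as in-memory dictionaries rather than read from a real registry or repository.

```python
import hashlib


def tree_digest(files: dict[str, bytes]) -> str:
    """Return a SHA-256 digest over a set of files, independent of dict order."""
    h = hashlib.sha256()
    for path in sorted(files):
        # Length-prefix each field so different path/content splits cannot collide.
        for field in (path.encode("utf-8"), files[path]):
            h.update(len(field).to_bytes(8, "big"))
            h.update(field)
    return h.hexdigest()


def matches_source(package_files: dict[str, bytes], tagged_files: dict[str, bytes]) -> bool:
    """True when the published package holds exactly the files at the git tag."""
    return tree_digest(package_files) == tree_digest(tagged_files)


# Hypothetical example data: the tagged source and a tampered upload.
tagged = {
    "index.js": b"module.exports = 1;\n",
    "package.json": b'{"version": "1.0.1"}\n',
}
tampered = {**tagged, "index.js": b"module.exports = steal();\n"}
```

In practice many packages ship build output rather than raw source, so a real check would likely need to compare against a reproducible build of the tag. The sketch only covers the simple case where the package is the tagged tree.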
GitHub can fix all these issues simply by maintaining its own package registries that conform to each language's requirements (e.g., an npm-compatible endpoint for JavaScript, a PyPI-compatible endpoint that pip can use for Python). Packages would correspond one-to-one with published releases.
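To make the correspondence between releases and packages concrete, here is a rough sketch of how a registry could derive a package name and version directly from a repository and its release tag, leaving no separate name for a squatter to claim. The naming scheme (an npm-style scope taken from the repository owner) and the function `release_to_package` are assumptions for illustration, not something GitHub is known to do in this exact form.

```python
import re

# Accept tags such as "v1.0.1" or "1.0.1-beta.1" (a simplified semver pattern).
SEMVER_TAG = re.compile(r"v?(\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?)")


def release_to_package(repo: str, tag: str) -> tuple[str, str]:
    """Map "owner/name" plus a release tag to (package name, version)."""
    owner, sep, name = repo.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/name', got {repo!r}")
    match = SEMVER_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"tag {tag!r} is not a version tag")
    # npm package names are lowercase, so normalize the case.
    return f"@{owner.lower()}/{name.lower()}", match.group(1)
```

Under this scheme, the hypothetical repository `octocat/Hello-World` with a release tagged `v1.0.1` would map to the package `@octocat/hello-world` at version `1.0.1`, and a release tagged with something that is not a version would be rejected rather than published.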
Users could either configure their existing tools to point to GitHub's endpoints, or GitHub could publish its own tool that covers the most popular languages.
Why would GitHub do this?
- It already owns npm, so the incentives seem aligned to me
- Better data on library usage – downloads through package managers don't go through GitHub
- Quality-of-life improvements for open source maintainers
- Easier (and safer) use of third-party code, which turns the flywheel at GitHub