Today, we are excited to announce the deps.dev API, which provides free access to the deps.dev dataset of security metadata, including dependencies, licenses, advisories, and other critical health and security signals for more than 50 million open source package versions.
Software supply chain attacks are increasingly common and harmful, with high profile incidents such as Log4Shell, Codecov, and the recent 3CX hack. The overwhelming complexity of the software ecosystem causes trouble for even the most diligent and well-resourced developers.
We hope the deps.dev API will help the community make sense of complex dependency data that allows them to respond to—or even prevent—these types of attacks. By integrating this data into tools, workflows, and analyses, developers can more easily understand the risks in their software supply chains.
As part of Google’s ongoing efforts to improve open source security, the Open Source Insights team has built a reliable view of software metadata across 5 packaging ecosystems. The deps.dev data set is continuously updated from a range of sources: package registries, the Open Source Vulnerability database, code hosts such as GitHub and GitLab, and the software artifacts themselves. This includes 5 million packages, more than 50 million versions, from the Go, Maven, PyPI, npm, and Cargo ecosystems—and you'd better believe we're counting them!
We collect and aggregate this data and derive transitive dependency graphs, advisory impact reports, OpenSSF Security Scorecard information, and more. Where the deps.dev website allows human exploration and examination, and the BigQuery dataset supports large-scale bulk data analysis, this new API enables programmatic, real-time access to the corpus for integration into tools, workflows, and analyses.
The API is used by a number of teams internally at Google to support the security of our own products. One of the first publicly visible uses is the GUAC integration, which uses the deps.dev data to enrich SBOMs. We have more exciting integrations in the works, but we’re most excited to see what the greater open source community builds!
We see the API as being useful for tool builders, researchers, and tinkerers who want to answer questions like:
Taken together, this information can help answer the most important overarching question: how much risk would this dependency add to my project?
The API can help surface critical security information where and when developers can act. This data can be integrated into:
The API has a couple of great features that aren’t available through the deps.dev website.
A unique feature of the API is hash queries: you can look up the hash of a file's contents and find all the package versions that contain that file. This can help figure out what version of which package you have even absent other build metadata, which is useful in areas such as SBOMs, container analysis, incident response, and forensics.
The deps.dev dependency data is not just what a package declares (its manifests, lock files, etc.), but rather a full dependency graph computed using the same algorithms as the packaging tools (Maven, npm, Pip, Go, Cargo). This gives a real set of dependencies similar to what you would get by actually installing the package, which is useful when a package changes but the developer doesn’t update the lock file. With the deps.dev API, tools can assess, monitor, or visualize expected (or unexpected!) dependencies.
For a demonstration of how the API can help software supply chain security efforts, consider the questions it could answer in a situation like the Log4Shell discovery:
The API service is globally replicated and highly available, meaning that you and your tools can depend on it being there when you need it.
It's also free and immediately available—no need to register for an API key. It's just a simple, unauthenticated HTTPS API that returns JSON objects:
# List the advisories affecting log4j 1.2.17 $ curl https://api.deps.dev/v3alpha/systems/maven/packages/log4j%3Alog4j/versions/1.2.17 \ | jq '.advisoryKeys[].id' "GHSA-2qrg-x229-3v8q" "GHSA-65fg-84f6-3jq3" "GHSA-f7vh-qwp3-x37m" "GHSA-fp5r-v3w9-4333" "GHSA-w9p3-5cr8-m3jj"
A single API call to list all the GHSA advisories affecting a specific version of log4j.
Check out the API Documentation to get started, or jump straight into the code with some examples.
Software supply chain security is hard, but it’s in all our interests to make it easier. Every day, Google works hard to create a safer internet, and we’re proud to be releasing this API to help do just that, and make this data universally accessible and useful to everyone.
We look forward to seeing what you might do with the API, and would appreciate your feedback. (What works? What doesn't? What makes it better?) You can reach us at depsdev@google.com, or by filing an issue on our GitHub repo.
Post a Comment
No comments :
Post a Comment