Posted by Asra Ali and Laurent Simon, Google Open Source Security Team (GOSST)

Many of the recent high-profile software attacks that have alarmed open-source users globally were consequences of supply chain integrity vulnerabilities: attackers gained control of a build server to use malicious source files, inject malicious artifacts into a compromised build platform, and bypass trusted builders to upload malicious artifacts.

Each of these attacks could have been prevented if there were a way to detect that the delivered artifacts diverged from the expected origin of the software. But until now, generating verifiable information that described where, when, and how software artifacts were produced (information known as provenance) was difficult. This information allows users to trace artifacts verifiably back to the source and develop risk-based policies around what they consume. Currently, provenance generation is not widely supported, and solutions that do exist may require migrating build processes to services like Tekton Chains.

This blog post describes a new method of generating non-forgeable provenance using GitHub Actions workflows for isolation and Sigstore’s signing tools for authenticity. Using this approach, projects building on GitHub runners can achieve SLSA 3 (the third of four progressive SLSA “levels”), which affirms to consumers that your artifacts are authentic and trustworthy.

Provenance

SLSA ("Supply-chain Levels for Software Artifacts”) is a framework to help improve the integrity of your project throughout its development cycle, allowing consumers to trace the final piece of software you release all the way back to the source. Achieving a high SLSA level helps to improve the trust that your artifacts are what you say they are.

This blog post focuses on build provenance, which gives users important information about the build: who performed the release process? Was the build artifact protected against malicious tampering? Source provenance describes how the source code was protected, which we’ll cover in future blog posts, so stay tuned.

Go prototype to generate non-forgeable build provenance

To create tamperless evidence of the build and allow consumer verification, you need to:

Isolate the provenance generation from the build process;
Isolate against maintainers interfering in the workflow;
Provide a mechanism to identify the builder during provenance verification.

The full isolation described in the first two points allows consumers to trust that the provenance was faithfully recorded; entities that provide this guarantee are called trusted builders.

Our Go prototype solves all three challenges. It also includes running the build inside the trusted builder, which provides a strong guarantee that the build achieves SLSA 3’s ephemeral and isolated requirement.

How does it work?

The following steps create the trusted builder that is necessary to generate provenance in isolation from the build and maintainer’s interference.

Step One: Create a reusable workflow on GitHub runners

Leveraging GitHub’s reusable workflows provides the isolation mechanism from both maintainers’ caller workflows and from the build process. Within the workflow, Github Actions creates fresh instances of virtual machines (VMs), called runners, for each job. These separate VMs give the necessary isolation for a trusted builder, so that different VMs compile the project and generate and sign the SLSA provenance (see diagram below).

Running the workflow on GitHub-hosted runners gives the guarantee that the code run is in fact the intended workflow, which self-hosted runners do not. This prototype relies on GitHub to run the exact code defined in the workflow.

The reusable workflow also protects against possible interference from maintainers, who could otherwise try to define the workflow in a way that interferes with the builder. The only way to interact with a reusable workflow is through the input parameters it exposes to the calling workflow, which stops maintainers from altering information via environment variables, steps, services and defaults.

To protect against the possibility of one job (e.g. the build step) tampering with the other artifacts used by another job (the provenance step), this approach uses a trusted channel to protect the integrity of the data. We use job outputs to send hashes (due to size limitations) and then use the hashes to verify the binary received via the untrusted artifact registry.

Step 2: Use OpenID Connect (OIDC) to prove the identity of the workflow to an external service (Sigstore)

OpenID Connect (OIDC) is a standard used across the web for identity providers (e.g., Google) to attest to the identity of a user for a third party. GitHub now supports OIDC in their workflows. Each time a workflow is run, a runner can mint a unique JWT token from GitHub’s OIDC provider. The token contains verifiable information of the workflow identity, including the caller repository, commit hash, trigger, and the current (reuseable) workflow path and reference.

Using OIDC, the workflow proves its identity to Sigstore's Fulcio root Certificate Authority, which acts as an external verification service. Fulcio signs a short-lived certificate attesting to an ephemeral signing key generated in the runner and tying it to the workload identity. A record of signing the provenance is kept in Sigstore’s transparency log Rekor. Users can use the signing certificate as a trust anchor to verify that the provenance was authenticated and non-forgeable; it must have been created inside the trusted builder.

Verification

The consumer can verify the artifact and its signed provenance with these steps:

Look up the corresponding Rekor log entry and verify the signature;
Verify the trusted builder identity by extracting it from the signing certificate;
Check that the provenance information matches the expected source and build.

See an example in action in the official repository.

Performing these steps guarantees to the consumer that the binary was produced in the trusted builder at a given commit hash attested to in the provenance. They can trust that the information in the provenance was non-forgeable, allowing them to trust the build “recipe” and trace their artifact verifiably back to the source.

Extra Bonus: Keyless signing

One extra benefit of this method is that maintainers don’t need to manage or distribute cryptographic keys for signing, avoiding the notoriously difficult problem of key management. The OIDC protocol requires no hardcoded, long-term secrets be stored in GitHub's secrets, which sidesteps the potential problem of key mismanagement invalidating the SLSA provenance. Consumers simply use OIDC to verify that the binary artifact was built from a trusted builder that produced the expected provenance.

Next Steps

Utilizing the SLSA framework is a proven way for ensuring software supply-chain integrity at scale. This prototype shows that achieving high SLSA levels is easier than ever thanks to the newest features of popular CI/CD systems and open-source tooling. Increased adoption of tamper-safe (SLSA 3+) build services will contribute to a stronger open-source ecosystem and help close one easily exploited gap in the current supply chain.

We encourage testing and adoption and welcome any improvements to the project. Please share feedback, comments and suggestions at slsa-github-generator-go and slsa-verifier project repositories. We will officially release v1 in a few weeks!

In follow-up posts, we will demonstrate adding non-forgeable source provenance attesting to secure repository settings, and showcase the same techniques for other build toolchains and package managers, etc. Stay tuned!

April 7, 2022

Improving software supply chain security with tamper-proof builds