Software Bill of Materials That You Can Trust

You might be on the journey of adopting Software Bills of Materials (SBOMs) in your organization to strengthen your security or to comply with external regulations, such as US Executive Order 14028. But what if I told you that your SBOMs might be useless, or even harmful?

In this blog post, we’ll discuss what it means to have SBOMs you can trust and how you can achieve it with the help of Open Source tools.

What’s Trust?

When implementing a system that covers the SBOM lifecycle, from generation to storage, distribution, and processing, it is common to overlook the importance of ensuring the trustworthiness of the SBOM documents.

What does “trust” mean in this context? To me, the easiest way to define it is to ask some questions:

  • Can I uniquely identify an SBOM document? If I know the SBOM I am looking for, can I find it? Can I make sure it has not been overwritten? 
  • Can I ensure the document will be there whenever I need it for processing, auditing, etc.?
  • Can I trust that the content has not been tampered with between the moment it was generated or provided and the moment it was analyzed?
  • How was it built, who did it, or where did it come from?
  • Is it complete? Does it have all the information I need? 

If the answer to some of those questions is no, I have bad news for you: an SBOM you can’t trust is useless and can lead to a false sense of security.

You need SBOMs to be uniquely identifiable, unforgeable, complete, and available.

But why is this important now? 

Two main reasons:

1 - Software Supply Chain (SSC) metadata, such as SBOMs, is increasingly taking center stage in SSC operations as a key tool for security, visibility, and compliance. It is as important as the software artifact (e.g., a container image) it references.

2 - An SBOM is just another deliverable assembled in your SSC and, hence, it is subject to supply chain attacks and misconfiguration.

The community has already done a great job putting together the SLSA framework, a reference threat model and security posture spec that helps you build software more safely. A quote that I like from that project:

“Any software can introduce vulnerabilities into a supply chain[…] it’s critical to already have checks and best practices in place to guarantee artifact integrity, that the source code you’re relying on is the code you’re actually using[…]”

In other words, we need mechanisms to guarantee integrity and add provenance. And this applies to metadata too, including your SBOMs.

Building the Supply Chain Metadata Trust Layer

To implement a system that provides end-to-end trust for SBOMs or any other piece of metadata (VEX files, test reports, CVE scans, etc.), five properties must be met: availability, uniqueness, integrity, provenance, and enforcement (completeness).

  • Availability: Control the storage backend’s replication, retention policy, and geolocation compliance requirements.
  • Uniqueness: Offer global, immutable, unique identification of a piece of metadata or artifact.
  • Integrity: Have a mechanism to prevent and detect overwrites.
  • Provenance/Verification: Provide verifiable, additional context attached to the metadata/artifact, such as who crafted or collected it, how, and when.
  • Enforcement: Set up policies that guarantee the piece of evidence/artifact is actually collected.

Building blocks

To implement a system with these properties, we must introduce two key components: attestations and content-addressable storage.

“A software attestation is an authenticated statement (metadata) about a software artifact or collection of software artifacts” - slsa.dev

In other words, it’s another metadata file that adds context about a step in your Software Delivery Lifecycle (SDLC), e.g., a binary build step or SBOM generation at packaging time. The interesting bit here is the word “authenticated”, meaning that it’s signed and can be verified later on.

We can use attestations to wrap our SBOMs and enable integrity and provenance verification. A popular combination of tools is in-toto for the attestation format and Sigstore for signing and verifying.
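To make the wrapping more concrete, here is a minimal Python sketch of the idea. It builds an in-toto statement whose subject is the artifact’s digest and whose predicate is the SBOM, then places it in a DSSE-style envelope. Some loud assumptions: the artifact name, SBOM content, and key are made up for illustration, and an HMAC from the standard library stands in for a real Sigstore signature (which would rely on asymmetric or keyless signing). Treat it as an illustration of the structure, not of how cosign or Sigstore work internally.

```python
import base64
import hashlib
import hmac
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
CYCLONEDX_PREDICATE = "https://cyclonedx.org/bom"
PAYLOAD_TYPE = "application/vnd.in-toto+json"


def build_statement(artifact_name: str, artifact_bytes: bytes, sbom: dict) -> dict:
    """Wrap an SBOM in an in-toto statement bound to the artifact's digest."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return {
        "_type": STATEMENT_TYPE,
        "subject": [{"name": artifact_name, "digest": {"sha256": digest}}],
        "predicateType": CYCLONEDX_PREDICATE,
        "predicate": sbom,
    }


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode()
    return b"DSSEv1 %d %s %d %s" % (len(type_bytes), type_bytes, len(payload), payload)


def sign(statement: dict, key: bytes) -> dict:
    """Produce a DSSE-style envelope. HMAC is a stand-in, NOT a Sigstore signature."""
    payload = json.dumps(statement, sort_keys=True).encode()
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode(),
        "signatures": [{"sig": base64.b64encode(sig).decode()}],
    }


def verify(envelope: dict, key: bytes) -> dict:
    """Check the signature and return the statement, or raise ValueError."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    actual = base64.b64decode(envelope["signatures"][0]["sig"])
    if not hmac.compare_digest(expected, actual):
        raise ValueError("signature mismatch: attestation altered or wrong key")
    return json.loads(payload)


# Hypothetical artifact and SBOM, for illustration only.
sbom = {"bomFormat": "CycloneDX", "specVersion": "1.5", "components": []}
envelope = sign(build_statement("my-image", b"image bytes", sbom), key=b"demo-key")
statement = verify(envelope, key=b"demo-key")
print(statement["subject"][0]["digest"]["sha256"])
```

The subject digest is what ties the SBOM to one specific artifact, and the signature over the whole statement is what lets a consumer detect tampering with either the SBOM or that binding.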

At this point, we have verifiable SBOMs with additional provenance information, but how do we store them so they meet the availability, uniqueness, and integrity requirements, too? Enter Content Addressable Storage (CAS).

We’ve written extensively about the role of Content-Addressable Storage (CAS) but, in a nutshell, “CAS is a system that organizes and retrieves data based on the data's content, rather than its location or name, ensuring data integrity and immutability.”

By using a CAS, stored SBOMs will be unique, identifiable, and integrity-verifiable. A popular Open Source CAS implementation in the cloud native world is an OCI registry.
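As a rough illustration of why a CAS gives you those properties, here is a toy in-memory store in Python. It is a sketch under simplifying assumptions (a dictionary instead of a real backend, SHA-256 as the digest, a class name invented for this post) and not how an OCI registry is implemented, but the principle is the same: the key is derived from the content.

```python
import hashlib


def digest_of(content: bytes) -> str:
    """Compute the content address, written in the common 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


class ContentAddressableStore:
    """Toy in-memory CAS: blobs are keyed by the digest of their bytes."""

    def __init__(self) -> None:
        self._blobs = {}  # digest -> bytes

    def put(self, content: bytes) -> str:
        digest = digest_of(content)
        # The same content always maps to the same key, so a re-upload is a
        # no-op, and different content can never overwrite an existing entry.
        self._blobs.setdefault(digest, content)
        return digest

    def get(self, digest: str) -> bytes:
        content = self._blobs[digest]
        # Re-hash on read so corruption or tampering in the backend is detected.
        if digest_of(content) != digest:
            raise ValueError(f"integrity check failed for {digest}")
        return content


store = ContentAddressableStore()
ref = store.put(b'{"bomFormat": "CycloneDX"}')
print(ref, store.get(ref) == b'{"bomFormat": "CycloneDX"}')
```

Because the reference is the digest, anyone holding it can verify what they downloaded without trusting the storage backend.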

You can implement this pattern by creating an in-toto attestation for your SBOM, signing it, and pushing it to an OCI registry using cosign.

At this point, we’re in pretty good shape on our trusted SBOM journey.

End-to-end solution with Chainloop

An alternative to implementing the attestation and metadata pipeline yourself is to use Chainloop, an opinionated implementation of this pattern.

Chainloop is an Open Source Metadata Vault for your SSC metadata, SBOMs, VEX, SARIF files, and more.

Let’s map Chainloop’s functionality onto our trusted storage pattern, layer by layer.

  • The provenance/verification layer stays the same. Chainloop uses Sigstore, in-toto, and SLSA as building blocks.
  • The storage layer has been extended by supporting multiple storage backends (OCI, S3, Azure Blob Storage, etc.) abstracted behind a federated content-addressable API.
  • Metadata collection can be enforced with the use of declarative contracts.
  • Routing metadata to non-storage endpoints is enabled through third-party integrations.
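To give a feel for what enforcing collection through a declarative contract could look like, here is a small Python sketch. Everything in it is hypothetical: the contract layout, the material names, and the `missing_materials` helper are invented for this example and are not Chainloop’s actual contract schema or API. The point is only the idea that a contract declares what must be collected, and the check fails when something required is absent.

```python
def missing_materials(contract: dict, collected: dict) -> list:
    """Return the names of required materials that are absent or of the wrong type."""
    problems = []
    for material in contract["materials"]:
        if material.get("optional", False):
            continue  # optional materials never block the check
        got = collected.get(material["name"])
        if got is None or got["type"] != material["type"]:
            problems.append(material["name"])
    return problems


# Hypothetical contract: what a pipeline run is expected to provide.
contract = {
    "materials": [
        {"name": "image", "type": "CONTAINER_IMAGE"},
        {"name": "sbom", "type": "SBOM_CYCLONEDX_JSON"},
        {"name": "scan", "type": "SARIF", "optional": True},
    ]
}

# Hypothetical run that forgot to attach the SBOM.
collected = {"image": {"type": "CONTAINER_IMAGE", "digest": "sha256:..."}}

problems = missing_materials(contract, collected)
if problems:
    print("contract not satisfied, missing:", ", ".join(problems))
```

A check like this covers the completeness question from earlier: a run that skips SBOM generation is flagged instead of silently producing nothing.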

Final thoughts

SBOMs and other supply chain metadata are just another artifact assembled in your Software Supply Chain, and they are as important as your actual software artifact. That’s why you need to make sure your SBOMs are trustworthy all the way from generation and collection to storage and analysis.

The good news is that metadata trust is becoming top of mind. Not only is the community starting to discuss what trusted SBOMs mean, but there are also good Open Source tools out there. Whether you decide to build an end-to-end solution yourself or use Chainloop, you’ll be in good hands.

If you want to learn more, we gave a talk last week that expands on this topic. Give it a look!

Please send feedback our way, and if you like what we do, give our GitHub repository a star :)