Debian's quest for reproducible builds

Bit by Bit

New Tools

Eighty percent of the technical problems related to reproducibility are caused by timestamps that change with every build, as well as by differing time zones and locales.

Many build tools capture data such as the time and date of the build in a non-deterministic way (i.e., the same steps, given the same input values, produce different results). The strip-nondeterminism library [9] is meant to correct this problem: It strips information from the build results that would otherwise cause them to differ from one build to the next (Figure 5).

Figure 5: The toolbox: The most important tools in the reproducible builds project [1].
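To make the timestamp problem concrete, the following minimal Python sketch compresses the same input twice with gzip, which stores a modification time in its header. It is only an illustration of the class of problem and not the code of strip-nondeterminism; the helper name gzip_bytes and the sample timestamps are assumptions made for this example.

```python
import gzip
import io
from typing import Optional


def gzip_bytes(payload: bytes, mtime: Optional[float]) -> bytes:
    """Compress payload in memory; mtime is written into the gzip header."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=mtime) as archive:
        archive.write(payload)
    return buffer.getvalue()


payload = b"identical input\n"

# Two "builds" one minute apart: the input is the same, the output is not.
build_a = gzip_bytes(payload, mtime=1_000_000_000)
build_b = gzip_bytes(payload, mtime=1_000_000_060)
print(build_a == build_b)  # False

# With a fixed timestamp, both builds are bitwise identical.
fixed_a = gzip_bytes(payload, mtime=0)
fixed_b = gzip_bytes(payload, mtime=0)
print(fixed_a == fixed_b)  # True
```

The two differing builds deviate only in the four header bytes that hold the timestamp; the compressed data itself is the same.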

The Source_Date_Epoch specification [10] was devised to replace these timestamps with a deterministic value: a string that gives the time of the last modification of the source text as the number of seconds since January 1, 1970, 00:00 UTC (i.e., the start of the Unix epoch).
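A build tool can honor this value roughly as in the sketch below. The specification names the environment variable SOURCE_DATE_EPOCH; the function names build_timestamp and build_date_string, as well as the fallback to the current time when the variable is unset, are assumptions chosen for this example.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> int:
    """Return SOURCE_DATE_EPOCH as an integer, or the current time if unset."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is None:
        return int(time.time())  # non-deterministic fallback
    return int(value)


def build_date_string(environ=os.environ) -> str:
    """Format the build timestamp in UTC, independent of the local time zone."""
    stamp = build_timestamp(environ)
    moment = datetime.fromtimestamp(stamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# The same value always yields the same embedded date.
print(build_date_string({"SOURCE_DATE_EPOCH": "1470000000"}))
# 2016-07-31 21:20:00 UTC
```

Formatting the date in UTC also sidesteps differences between the time zones of the build machines.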

However, not all problems can be solved automatically, so a great deal of additional manual work is needed. Detailed information about the practices used can be found on the Documentation page [11]. Diffoscope [12] also plays a prominent role in understanding why a package cannot be built reproducibly. This tool can compare files or folders in depth (Figure 6) by unpacking the program archive and transforming the binary packages into human-readable formats.

Figure 6: Diffoscope allows in-depth comparisons for troubleshooting. Here, it is investigating two versions of a Firefox extension [12].
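The basic idea of looking inside an archive instead of comparing it as a single blob can be sketched in a few lines of Python. This is a greatly simplified illustration for ZIP-based archives and not how Diffoscope is implemented; all function names and the sample file contents are assumptions made for this example.

```python
import hashlib
import io
import zipfile


def member_digests(archive_bytes: bytes) -> dict:
    """Map each member name in a ZIP archive to the SHA-256 of its content."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {
            name: hashlib.sha256(archive.read(name)).hexdigest()
            for name in archive.namelist()
        }


def compare_archives(first: bytes, second: bytes) -> dict:
    """Report members that are missing on one side or differ in content."""
    a, b = member_digests(first), member_digests(second)
    return {
        "only_in_first": sorted(set(a) - set(b)),
        "only_in_second": sorted(set(b) - set(a)),
        "content_differs": sorted(n for n in set(a) & set(b) if a[n] != b[n]),
    }


def make_zip(files: dict) -> bytes:
    """Build a small ZIP archive in memory (helper for the demonstration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


old = make_zip({"manifest.json": '{"version": "1.0"}', "main.js": "run();"})
new = make_zip(
    {"manifest.json": '{"version": "1.1"}', "main.js": "run();", "extra.js": ""}
)
print(compare_archives(old, new))
# {'only_in_first': [], 'only_in_second': ['extra.js'], 'content_differs': ['manifest.json']}
```

A report like this narrows the search to individual files, which can then be examined in a human-readable form.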

Reproducible builds represent an important step toward the kind of security needed in the future to protect the integrity of our digital lives. However, they are not a panacea: Intelligence services and organized criminals have ways to foist malicious software on the world, and there is hardly any defense against them.

An example of this is the Trusting Trust attack, first described in 1974 [13]. The attack manipulates the binary of a compiler in such a way that the manipulation survives whenever the compiler recompiles itself. The manipulated compiler then produces software that is modified in a virtually undetectable way, for example, by adding backdoors or other nasties.

One Bit Is Enough

Ken Thompson [14], one of the legendary fathers of Unix, also made it quite clear in an amusing speech [15] at an awards ceremony that there is no such thing as absolutely secure software – unless you write your own compiler and build all of the software yourself. Often, a single bit in a 500KB binary package makes the difference between secure and vulnerable software.

This was demonstrated by Tor developers in a talk [16] at the 31st Chaos Communication Congress (31C3) in 2014, using a 2002 bug in OpenSSH (CVE-2002-0083) as an example. There, a single bit decided whether an attacker could gain root privileges on the machine. The talk also demonstrated an attack that changed the code of a kernel module in memory only. On screen, the code appeared to be intact, but the resulting binary package was compromised.

To prevent similar attacks, the idea of reproducible builds began to take shape, initially in projects such as Tor and Bitcoin, because security and trust play a central role in these applications. Both projects have been built in an entirely reproducible way since 2012. Gitian [17], a distribution method that offers a deterministic build process in a container or a virtual machine, provides the underpinnings.

Using Gitian Builder [18], several people can build the package independently of each other and of their local environment. If everything is okay, the result is bitwise identical binary packages. Gitian is fine for smaller projects, but for a complete distribution such as Debian, with thousands of source packages, the time and staff overhead rule it out.
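The check that independent builders perform at the end boils down to comparing cryptographic digests of their results. The following sketch shows that comparison in Python; it is not Gitian's actual mechanism, and the builder names, function names, and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a build result."""
    return hashlib.sha256(artifact).hexdigest()


def group_builders(artifacts: dict) -> dict:
    """Group builder names by the digest of the artifact each one produced."""
    groups = defaultdict(list)
    for builder, artifact in artifacts.items():
        groups[digest(artifact)].append(builder)
    return dict(groups)


def is_reproducible(artifacts: dict) -> bool:
    """True only if every builder produced a bitwise identical artifact."""
    return len(group_builders(artifacts)) == 1


package = b"\x7fELF" + bytes(64)  # stand-in for a binary package
tampered = bytearray(package)
tampered[10] ^= 0x01  # flip a single bit

results = {"alice": package, "bob": package, "carol": bytes(tampered)}
print(is_reproducible(results))  # False
for checksum, builders in group_builders(results).items():
    print(checksum[:16], builders)
```

Because a single flipped bit changes the digest, one deviating builder is enough to show that something is wrong, although the comparison alone does not reveal which side was manipulated.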


Among distributions, Debian is a pioneer of verifiable packages. Meanwhile, other distributions are adopting the infrastructure Debian created in Jenkins and the tools it uses to solve problems and validate the results. For example, Fedora, openSUSE, and Arch Linux can verify the results of Debian and vice versa.

Last year, a cross-distribution meeting was held in Athens, Greece, with more than 40 participants; this year will see a repeat (Figure 7). At the beginning of August, the developers of Hardened GNU/Linux [19], who harden the Linux kernel with PaX/grsecurity, reported the availability of patches to build PaX/grsecurity reproducibly.

Figure 7: European developer meeting and Google Summer of Code participation in 2015/2016. (Slide by Holger Levsen [2])

This collaboration creates a web of trust for users. It makes reproducible builds trustworthy for those who either cannot or do not want to perform the checks themselves to discover whether two packages match up at the bit level. The future shape of the entire process and the expected results need further clarification – but these problems can be solved as soon as an archive exists that is almost entirely built in a reproducible way.


  1. Reproducible builds:
  2. "Reproducible Builds for Debian and a Hope for a More Secure Future" by Holger Levsen, RIPE 72, May 23-27, 2016, Copenhagen, Denmark,
  3. Statistics:
  4. Weekly blog:
  5. Jenkins:
  6. Holger Levsen:
  7. srebuild:
  8. Snapshot archive:
  9. Strip nondeterminism:
  10. Source_Date_Epoch:
  11. Documentation:
  12. Diffoscope:
  13. Trusting Trust attack:
  14. Ken Thompson:
  15. Thompson on security:
  16. Tor developers lecture:
  17. Gitian:
  18. Gitian Builder:
  19. Hardened GNU/Linux:
