openSUSE:Reproducible openSUSE/Part1
This is a report on Part 1 of my journey: building every package that makes up our minimalVM image 100% bit-reproducibly. It is sponsored by a grant from the nice people at the NLnet Foundation.
This target was chosen as the smallest useful result/artifact. The larger a package set gets, the more disk space and build power are required to build and verify all of it.
Design
The many sources are kept in https://build.opensuse.org/project/show/home:bmwiedemann:reproducible:distribution:ring0 . As usual, we use the original upstream tarball and apply patches as needed in the .spec file (build description).
The same project also keeps the scripts in the 000pbuildconf pseudo-package. With a checkout of these, you can verify the source checksums and start building locally using pbuild from the build package (obs-build on Debian). This results in 81 GB of binaries for ring0 and 448 GB of binaries for ring1. Calculating a hash over all these binaries produces the same result for builds on different machines on different days.
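The idea of a single hash over all binaries can be sketched as follows. This is a minimal illustration and not the script from 000pbuildconf: the function name hash_tree is hypothetical, and it assumes GNU find, sort, xargs and sha256sum. It hashes every file, lists the results in a locale-independent order, and then hashes that list.

```shell
# Hypothetical helper (not part of 000pbuildconf): print one hash that
# summarizes the content of every file below a directory.
hash_tree() {
    (
        cd "$1" || exit 1
        # Sort the file names bytewise (LC_ALL=C) so the order of the list
        # does not depend on the locale or on the order find returns them.
        find . -type f -print0 |
            LC_ALL=C sort -z |
            xargs -0 -r sha256sum
    ) | sha256sum | cut -d' ' -f1
}
```

Because only file names and file contents enter the result, two trees built on different machines and days give the same value exactly when every binary is bit-identical.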
What I did and How to check
https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/thread/2XMBAI6G4ZFAS7LFWM2XPYHZYMG5C2GF/ already documents some details of this.
However, after that I found a few more issues. perl embeds the kernel version of the build machine. This showed up twice: once when I ran pbuild without --kvm, so that the build host's kernel version went in, and a second time when I switched the base snapshot in the _pbuild config, which came with a different kernel. pbuild did not trigger a rebuild of perl, likely because the kernel-obs-build package counts as "Support" and not as "BuildRequires". I resolved this by explicitly adding kernel-source to ring0. We need a kernel for minimalVM anyway.
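One way to see such an embedded version is to look at perl's build configuration. The sketch below assumes that the value in question is the osvers entry of perl's Config module, which is recorded when perl is configured; the function names are hypothetical.

```shell
# Print the OS version that was recorded when this perl was built.
# Assumption: osvers is the field that carries the kernel version.
perl_embedded_kernel() {
    perl -MConfig -e 'print "$Config{osvers}\n"'
}

# Show the recorded value next to the kernel that is running right now.
report_kernel_versions() {
    echo "perl was built on kernel: $(perl_embedded_kernel)"
    echo "currently running kernel: $(uname -r)"
}
```

If the first line changes between two builds of the same perl sources, the build environment's kernel has leaked into the package.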
Then, the next issue that showed up was that kernel-docs was unreproducible. Some experiments led to a workaround of running sphinx without -jauto . There is some discussion in https://lore.kernel.org/linux-doc/33018311-0bdf-4258-b0c0-428a548c710d@suse.de/T/#t and some renewed activity in the related Sphinx issue tracker after 5 years.
Another issue was noticed with kernel-vanilla. I had analyzed this earlier and found that I had to add the _projectcert.crt from OBS so that the build process would not create and embed a random public key.
Some additional notes: the ML post suggested using 'rbk' for test builds. However, this will only build the main package. If there is a _multibuild file involved, 'multibuildrbkall' will loop through all flavors and stop at the first unreproducible result. The 'nachbau' script can only verify the main package so far.
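The loop over a _multibuild file can be pictured roughly like this. It is a sketch of the idea and not the actual 'multibuildrbkall' script: both function names are hypothetical, the check command is a placeholder supplied by the caller, and the parser assumes one <flavor> or <package> element per line.

```shell
# Print the flavor names listed in a _multibuild file, one per line.
# Assumes one <flavor> or <package> element per line.
list_flavors() {
    sed -n -E 's:.*<(flavor|package)>([^<]*)</(flavor|package)>.*:\2:p' "${1:-_multibuild}"
}

# Run a caller-supplied check command once per flavor and stop at the
# first flavor for which it returns a non-zero status.
check_all_flavors() {
    check_cmd=$1
    for flavor in $(list_flavors "${2:-_multibuild}"); do
        "$check_cmd" "$flavor" || {
            echo "first unreproducible flavor: $flavor" >&2
            return 1
        }
    done
}
```

Stopping at the first failure keeps the turnaround short, since each flavor can mean a full rebuild.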
However, the easiest way is to use pbuild:
    osc co home:bmwiedemann:reproducible:distribution:ring0 && cd $_
    # verify source hashes
    sh 000pbuildconf/sha256sums.src
    #=> 43f88d34a93e6a91a99276a4913c2cb89bdde7d619fffe33102645215902aa62 000pbuildconf/sha256sums.src.out
    ln -sf 000pbuildconf/_* .
    sh 000pbuildconf/build.sh
    pbuild --result # optional
    # if more than 2 failed builds occur (e.g. see util-linux below), retry it with:
    sh 000pbuildconf/build.sh --rebuild failed
It needs around 150 GB in /var/tmp and ~80 GB for the ring0 binaries. After about a day, it should compute the hash of all binary hashes as
623d34c72d08d9bddba6efc59cb57c9c68d204a935e94b6eadbba195f53ca915 .sha256sums
If you get a different hash, Bernhard is very interested in getting a diff against http://rb.zq1.de/rbos/ring0-sha256sums (send mail to bwiedemann+rbos at suse de).
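Such a diff could be produced along these lines. This is a sketch: the function name diff_hashlists is hypothetical, and it assumes that both the local list and the published reference use the usual sha256sum output format of one "hash  filename" pair per line.

```shell
# Print a unified diff between two "hash  filename" lists.
# $1 = local list, $2 = reference list. Returns 0 if they match, 1 if not.
diff_hashlists() {
    mine=$(mktemp) && theirs=$(mktemp) || return 2
    # Sort by file name so that a different line order is not reported.
    LC_ALL=C sort -k2 "$1" > "$mine"
    LC_ALL=C sort -k2 "$2" > "$theirs"
    rc=0
    diff -u "$theirs" "$mine" || rc=$?
    rm -f "$mine" "$theirs"
    return "$rc"
}

# Possible usage, assuming the local list is in .sha256sums:
#   curl -sSf -o ring0-sha256sums http://rb.zq1.de/rbos/ring0-sha256sums
#   diff_hashlists .sha256sums ring0-sha256sums > ring0.diff
```

Lines starting with "-" come from the reference and lines starting with "+" from the local build, so the differing packages can be read off directly.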
ring0 sources and binaries are published. The latter are used for building in part2.
Automated testing
Since new code gets submitted daily towards openSUSE:Factory, I added automated checking via a loop around https://github.com/bmwiedemann/osc-plugin-factory/blob/rbchecker/rbcheck2.sh . It sets up two different builds in OBS that apply some artificial variations: building +1 year later vs. building in a 1-core VM. This uses the '-future1y' and '-j1' parts of https://github.com/bmwiedemann/reproducible-faketools .
List of issues+fixes
In the process of making this distribution, I created, collected and applied several code fixes.
Merged fixes:
Pending/WIP fixes + open discussions:
- kernel-docs / Sphinx
- xmlgraphics-fop
- rpm
- util-linux random build failure
- perl uname -r
- gcc-13 and gcc-14 don't build after 2038-01-19
- openssl-3 fails build-tests after 2035-07-02