I'm working again on making reproducible .exe-s. I thought I'd share my process:
Pros:
End users get a bit-for-bit reproducible .exe, known not to contain a trojan and auditable from source
Point releases can reuse the exact same build process and avoid introducing bugs
Steps:
Generate a source tarball (non-reproducibly)
Debian Docker as a base, with fixed version + snapshot.debian.org sources.list
Dockerfile: install packaged dependencies and MXE(.cc) from a fixed Git revision
Dockerfile: compile MXE with SOURCE_DATE_EPOCH + fix-ups
Build my project in the container with SOURCE_DATE_EPOCH and check SHA256
Copy-on-release
Result:
git.savannah.gnu.org/gitweb/?p=freedink/dfarc.git;a=tree;f=autobuild/dfarc-w32-snapshot
Generate a source tarball (non-reproducibly)
This is not reproducible due to using non-reproducible tools (gettext, automake tarballs, etc.) but it doesn't matter: only building from source needs to be reproducible, and the source is the tarball.
It would be better if the source tarball were perfectly reproducible, especially for large generated content (./configure, wxGlade-generated GUI source code...), but that can be a second step.
Debian Docker as a base
AFAIU the Debian Docker images are made by Debian developers but are in no way official images. That's a pity, and to be 100% safe I should start anew from debootstrap, but Docker provides a very efficient framework to build images, notably with caching of every build step, immediately available fresh containers, and a public image repository.
This means with a single:
sudo -g docker make
you get my project reproducibly built from scratch, with nothing to set up at all.
I avoid using a :latest tag, since it will change, and also backports, since they can be updated anytime. Here I'm using stretch:9.4 and no backports.
Using snapshot.debian.org in sources.list makes sure the installed packaged dependencies won't change at the next build. For a dot release however (not for a rebuild), the snapshot date should be updated in case there was a security fix that affects the built software (rare, but it happens).
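To make the pinning concrete, here is a minimal sketch that formats a sources.list line for a given snapshot timestamp. The helper name and the date are hypothetical, and the actual line in my Dockerfile may differ; only the snapshot.debian.org URL layout (archive name, then a UTC timestamp) is taken as given.

```python
from datetime import datetime, timezone


def snapshot_sources_line(timestamp: datetime, suite: str = "stretch",
                          components: str = "main") -> str:
    """Return a sources.list line pinned to a snapshot.debian.org timestamp.

    Hypothetical helper, for illustration only.
    """
    # snapshot.debian.org expects a UTC timestamp such as 20180501T000000Z
    stamp = timestamp.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return (f"deb http://snapshot.debian.org/archive/debian/{stamp}/ "
            f"{suite} {components}")


# Hypothetical snapshot date, not the one used by the project.
line = snapshot_sources_line(datetime(2018, 5, 1, tzinfo=timezone.utc))
print(line)
```

Since archived Release files are past their validity date, apt will probably also need Acquire::Check-Valid-Until set to "false" to accept such a source.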
Last but not least, I set APT::Install-Recommends "false"; for better dependency control.
MXE
MXE (mxe.cc) is a compilation environment to get MinGW (GCC for Windows) and selected dependencies rebuilt unattended with a single make. Doing this manually would be tedious because every other day, upstream breaks MinGW cross-compilation, and debugging an hour-long build process takes ages. Been there, done that.
MXE has a reproducible-boosted binutils with a patch for SOURCE_DATE_EPOCH that avoids getting date-based and/or random build timestamps in the PE (.exe/.dll) files. It's also compiled with --enable-deterministic-archives to avoid timestamp issues in .a files (but no automatic ordering).
I set SOURCE_DATE_EPOCH to the fixed Git commit date and I run MXE's build.
The SOURCE_DATE_EPOCH patch does not apply to GCC however, so I needed, for example, to patch a __DATE__ in wxWidgets.
In addition, libstdc++.a has a file ordering issue (said ordering surprisingly stays stable between a container and a host build, but varies when using a different computer with the same distro and tool versions). I hence re-archive libstdc++.a manually.
It's worth noting that PE files don't have issues with build paths (and varying BuildID-s - unlike ELF... T_T).
Again, for a dot release, it makes sense to update the MXE Git revision so as to catch security fixes, but at least I have the choice.
Build project
With this I can start a fresh Docker container and run the compilation process inside, as a non-privileged user just in case.
I set SOURCE_DATE_EPOCH to the release date at 00:00 UTC, or the Git revision date for snapshots.
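As an illustration of the release-date case, here is a small sketch that turns a calendar date into the matching SOURCE_DATE_EPOCH value and puts it in an environment for a child build process. The function name and the date are hypothetical; my actual build scripts do this in shell.

```python
import os
from datetime import date, datetime, timezone


def epoch_for_release(day: date) -> int:
    """Seconds since the Unix epoch for `day` at 00:00 UTC."""
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


# Hypothetical release date, for illustration only.
epoch = epoch_for_release(date(2018, 6, 1))

# Environment to hand to the build, e.g. subprocess.run(["make"], env=env)
env = dict(os.environ, SOURCE_DATE_EPOCH=str(epoch))
print(env["SOURCE_DATE_EPOCH"])
```

Pinning the time zone explicitly matters here: a naive local-time conversion would give a different value on machines in different zones. For snapshots, the commit timestamp can be read with git log -1 --format=%ct instead.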
This rebuild framework is excluded from the source tarball, so the latter stays stable during build tuning. I see it as a post-release tool, hence not part of the release (just like distro packaging).
The generated .exe is statically compiled, which helps in getting a stable result (only the few needed parts of dependencies get included in the final executable).
Since MXE is not itself reproducible, differences may come from MXE itself, which may need fixes as explained above. This is annoying and hopefully will be easier once they ship GCC6. To debug I unzip the different .zip-s, upx -d my .exe-s, and run diffoscope.
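Before reaching for diffoscope, it can help to find out which files actually differ between two unpacked builds. The following is a rough sketch of that triage step, not part of my build scripts; all function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict:
    """Map each file's path relative to `root` to its SHA256."""
    return {p.relative_to(root).as_posix(): sha256_of(p)
            for p in root.rglob("*") if p.is_file()}


def differing_files(build_a: Path, build_b: Path) -> list:
    """Relative paths that are missing on one side or differ in content."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(name for name in a.keys() | b.keys()
                  if a.get(name) != b.get(name))
```

Calling differing_files() on the two directories where the .zip-s were extracted returns an empty list when the builds match, and otherwise the short list of files worth feeding to diffoscope.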
I use various tricks (stable ordering, stable timestamping, metadata cleaning) to make the final .zip reproducible as well. Post-processing tools would be an alternative if they were fixed.
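To show what these three tricks amount to, here is a minimal sketch using Python's zipfile module. It is not what my build scripts do (they use command-line tools); the function name and the fixed permission bits are assumptions chosen for the example.

```python
import time
import zipfile
from pathlib import Path


def write_reproducible_zip(src_dir: Path, zip_path: Path, epoch: int) -> None:
    """Zip src_dir with sorted entries, one fixed timestamp, fixed metadata.

    `epoch` plays the role of SOURCE_DATE_EPOCH; the zip format cannot
    store dates before 1980, so it must not be earlier than that.
    """
    date_time = time.gmtime(epoch)[:6]  # (year, month, day, hour, min, sec)

    def rel(p: Path) -> str:
        return p.relative_to(src_dir).as_posix()

    # Stable ordering: do not rely on file system enumeration order.
    files = sorted((p for p in src_dir.rglob("*") if p.is_file()), key=rel)
    with zipfile.ZipFile(zip_path, "w") as zf:
        for path in files:
            # Stable timestamping: same date for every entry.
            info = zipfile.ZipInfo(rel(path), date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Metadata cleaning: fixed host system and permissions.
            info.create_system = 3  # "Unix", whatever the build host is
            info.external_attr = 0o100644 << 16
            zf.writestr(info, path.read_bytes())
```

The compressed bytes still depend on the zlib version in use, so this only gives identical archives inside a pinned environment such as the container described above.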
reprotest
Any process is moot if it can't be tested.
reprotest helps by running 2 successive compilations with varying factors (build path, file system ordering, etc.), and checking that we get the exact same binary. As a trade-off, I don't run it on the full build environment, just on the project itself. I plugged reprotest into the Docker container by running an sshd on the fly. I have another Makefile target to run reprotest on my host system, where I also installed MXE, so I can compare results and sometimes find differences (e.g. due to using a different filesystem). In addition this is faster for debugging, since changing anything in the early Dockerfile steps means a full 1h rebuild.
Copy-on-release
At release time I make a copy of the directory that contains all the self-contained build scripts and the Dockerfile, and rename it after the new release version. I'll continue improving upon the reproducible build system in the 'snapshot' directory, but the versioned directory will stay as-is and can be used in the future to get the same bit-for-bit identical .exe anytime.
This is the technique I used in my Android Rebuilds project.
Other platforms
For now I don't control the build process for other platforms: distros have their own autobuilders, so does F-Droid. Their problem :P
I have plans to make reproducible GNU/Linux AppImage-based builds in the future though. I should be able to use a finer-grained, per-dependency process rather than the huge MXE-based chunk I currently do.
I hope this helps other projects provide reproducible binaries directly! Comments/suggestions welcome.