Abstract: This piece is written by Bitcoin Core maintainer and BitMEX guest writer, Michael Ford. Michael is the first recipient of the expanding HDR Global Trading Limited Bitcoin developer grant program. This follows on from Michael’s first piece for us: Build Systems & Security – Bitcoin Is Improving. In this new piece, Michael explains four security improvements to Bitcoin Core that he has been working on:
- Fixing a hidden bug which prevented a security check from occurring,
- Fixing a security weakness on Windows,
- Solving a problem which weakened the random number generator caused by a failure to detect a function on MacOS, and
- Adding tests for a discrepancy in the behaviour of the macOS linker, when compared to the documentation.
This work illustrates the importance of testing on multiple platforms.
Recently I’ve spent some time working on improving the security of Bitcoin Core. Not only by finding and fixing security-related issues in Bitcoin Core itself, but also by improving our build system and extending our security-check and backwards compatibility scripts to ensure that regressions don’t occur. This post documents a few of these changes, as well as some of the things I am still working on.
For a project like Bitcoin Core, a well-functioning build system is vitally important. It controls the version, and features of c++ that we can use, our dependencies, and how they are configured, as well as the hardening and security features that are applied to our binaries. It also ensures cross-platform, as well as backwards compatibility and ultimately it gives us fine grained control over how bitcoind is constructed.
security-check.py & symbol-check.py skipping bitcoind
The `check-symbols` and `check-security` rules were added to our Makefile in #7424. These rules are a convenient way to pass our executables into a series of security and backwards-compatibility checks, and they are used as part of our release process.
However since they were added, they have been impacted by a subtle bug which silently prevented `bitcoind` from being passed into either of the scripts. The problem can be demonstrated with a few lines of Python:
# touch a, touch b, touch c
# python3 args.py < a b c
if __name__ == '__main__':
# ['args.py', 'b', 'c']
# if you add some lines to "a",
# you'll see them here.
for line in sys.stdin:
Usage of the `<` character in bash is an input redirection; which in our case, caused the first argument passed to either of the scripts (which is bitcoind) to be opened for reading on standard input. This meant it was essentially “unseen” by both scripts, which read their arguments from sys.argv.
This was fixed in #17857, and backported to some branches. Note that the potential impact of this issue was actually fairly low, as bitcoind is essentially a subset (code and dependency wise) of bitcoin-qt, which had still been getting checked by both scripts. However, it does serve as a good example of a subtle issue that could have had more severe consequences, which managed to sit, unnoticed, in the codebase for nearly 4 years.
ASLR was not functioning for bitcoin-cli on Windows
Address Space Layout Randomization (ASLR) is a security technique, which intends to make it harder for an attacker to reuse the data used by a process as part of an attack, by randomly positioning that data inside the processes address space. A combination of a bug in the linker we use for Windows releases (binutils-mingw-w64-x86-64) and the fact that one of our executables, `bitcoin-cli`, did not happen to export any symbols, meant that until recently, the cli binary was not receiving the benefits of ASLR at runtime.
The linker provided with binutils (GNU ld) has historically had a few issues with producing Windows binaries. One of these issues was that it would strip the .reloc section from binaries, even though this section is required for functioning ASLR on Windows (along with additional header bits etc), unless your binary happened to export any symbols.
All of our Windows binaries, apart from bitcoin-cli, do currently export secp256k1 symbols, and as such, have been unaffected by this problem. We’ve temporarily fixed the problem for bitcoin-cli, #18702, by exporting a single symbol (`main()`). Ultimately this issue will be resolved, and the workaround removed when we can use a very-new version of binutils to build our releases.
This fix was also backported to be part of the 0.20.0 release. In #18629 an additional test was added to our security-check script, to ensure that all Windows release binaries do contain a .reloc section. This coupled with the existing security checks, should ensure that all binaries are receiving the benefits of ASLR going forward.
sysctl() API differs between macOS and *BSD platforms
The `sysctl()` system call can be used to retrieve and set certain system information. This information includes things like the kernel boot time, cpu type, amount of RAM etc. Depending on whether you’re calling `sysctl` on macOS, or a BSD variant (FreeBSD, NetBSD etc), you’ll need to use slightly different function parameters. On macOS, the first argument to sysctl() is an `int *name`, whereas on *BSD, the first argument is a `const int *name`.
This slight difference meant that the calls in our build system we were using to detect sysctl() availability would sometimes fail when they shouldn’t have. We were originally using a `const int *name` parameter to for the call, which meant that when compiling on macOS, detection would fail silently (unless you went looking in config.log).
You may be wondering why this is interesting, and the reason was not mentioned in the description of the PR that fixed the issue, #18359, or the original change, which did not touch random.cpp.
Prior to the removal of OpenSSL, a new randomenv module was added to our RNG, which is used once a minute to collect additional entropy from the environment; sysctl() is one of the methods used to collect entropy. This meant that if the build system failed to detect the availability of sysctl() at compile time, we would not end up with those entropy collecting calls being used in our binary.
On a macOS system, this resulted in roughly 22 bytes less entropy being fed into the RNG, than what would have been if the sysctl() calls were working as intended.
This was not a critical issue, and did not compromise the new RNG module (to be released with 0.20.0), as the sysctl() calls are an “if-supported” addition on some systems. However this issue was another good example of a subtle build time failure that can have a non-obvious effect elsewhere in the binary.
Lazy binding, ld64 and the loader on macOS
Back in #17686, I added `-bind_at_load` as one of the “hardened” flags for our macOS build. This flag instructs ld64 to set a bit in the program header, which indicates to the dynamic loader (dyld) that you’d like to resolve all symbols at program launch, rather than lazily (when first used). This is similar to the behaviour that can be achieved by passing the `-z,now` flag to GNU ld.
However it was clear that contrary to the documentation, ld64 was not actually setting the `MH_BINDATLOAD` bit in the header of the binary. This posed two questions: Was the functionality actually working as expected, and if it was, how should we test for it in our security checks, if there is no header bit to look for?
Further testing indicated that when building with `-bind_at_load`, symbols were being bound at launch, and that the key to testing was likely in the lazy_bind_* structures in a different section of the MACHO header.
In a belated follow up, #18295, and after some time spent looking at the source of some of the Apple tools, we were originally going to patch the ld64 we use to build our releases to insert the missing bit. However, after some additional insight from Nick Kledzik, who works on the linker and loader at Apple, it was decided that any patching would not be necessary, and we would just add security checks for the header structures mentioned in #17686, while essentially ignoring the MH_BINDATLOAD bit, similar to the current behaviour of dyld.
As of the latest release of Apple’s command line tools (ld64-556.6), the manpage for ld64 still indicates that it will set the header bit when using the bind_at_load flag.
While this was just a quick summary of the recent changes I’ve been working on, hopefully they can serve as some examples as to why:
- Cross-platform testing is always important.
- You can’t always trust upstream tools to do the right thing.
- Even when they are doing the right thing you can’t always trust the documentation.
- Regardless of the above, you still need your own (working) checks and balances.
Michael Ford, Bitcoin Core Developer