Progress Towards Utreexo Goals

Abstract: In this piece, 100x Group grantee Calvin Kim explores several claims about the benefits of Utreexo and the extent these advantages have been realised. Calvin explains the significant progress which has been made in the latest implementation of Utreexo, however there is plenty of additional work to do before ordinary users will be using the technology.

In the last demonstration release in July of 2020, we pointed out that the Utreexo project going forward would implement the Utreexo accumulator into btcd, an existing Bitcoin implementation in Go. I’m excited to say that the implementation is ready for another demonstration release. In this release, a new pruned mode called a “Compact State Node” (CSN for short) can be demonstrated.

In my article published back in April of 2020 called “ELI5: Utreexo – A scaling solution”, I made several claims about the advantages that a CSN can offer. These claims were:

  1. A new full node mode in a few kilobytes that syncs as fast as a ssd on a hdd.
  2. Allows for parallelization of the initial block download.
  3. Strengthens the security of Bitcoin by allowing consensus to be independent of the database implementation (the current one in usage is made by Google).
  4. No forks needed to bring Utreexo to Bitcoin.

With the current state of development, Claims 3 and 4 are currently realized. Claim 1 is partly realized as the reduction of the node size to kilobytes is bottlenecked by non-Utreexo data. Claim 2 is currently still in progress.

Why Claim 3 is important

Strengthens the security of Bitcoin by allowing consensus to be independent of the database implementation (the current one in usage is made by Google)

For a few years now, one of the major focuses in improving Bitcoin’s security has been to remove any external dependencies that Bitcoin has. An external dependency is any code that the Bitcoin developers did not write but is necessary for Bitcoin software to be able to function. External dependencies are avoided as much as possible in any security critical project as they may be a source of bugs. The risk of this is minimized by reviewing the external dependency then keeping a copy of the reviewed code. However, this method is not perfect and is inferior to having code that is written, tested, and reviewed directly by the Bitcoin developers. For this reason, Bitcoin developers have been removing various external dependencies from Bitcoin (removal of OpenSSH code being one example).

Currently, the biggest external dependency is the database for which the Unspent Transaction Output set or UTXO set and block indexes are stored in. The database used is from Google and is called “LevelDB”. LevelDB being bug-free is critical for the security of Bitcoin. If there is a bug in LevelDB, it may allow for double spends or cause an unintended fork to happen. In fact in 2013, there was a bug with how Berkeley DB (the database used before LevelDB) was used in Bitcoin Core which caused older Bitcoin Core nodes to fail in reading block 225430, resulting in an unintended fork.

(The aforementioned UTXO set is a set of all the Bitcoin in the world that is currently able to be spent. The UTXO set is critical for Bitcoin security as this is directly part of the Bitcoin consensus and removing LevelDB from it would greatly help improving Bitcoin’s resilience.)

Realizing Claim 3

The reason there needs to be a database at all is because the UTXO set consists of more than 60 million UTXOs, all of which must be kept track of and be quickly accessible as slow access would slow down the initial block download. A database is often used in cases like this where a lot of small data need to be quickly accessible.

With Utreexo CSNs however, we don’t need a database at all. Instead, what we do is we make the spender of a UTXO provide the UTXO data and the Utreexo accumulator proof proving that the UTXO exists. This rids us of the need to keep a UTXO set in the Utreexo CSN implementation. Since we don’t need to keep a UTXO set, this allows us to remove LevelDB from another critical section of the Bitcoin consensus.

Here is how current block validation works compared to Utreexo CSN block validation for extending the main chain:

Current Block validation 

Utreexo CSN block validation

1. Check proof of work

1. Check proof of work

2. For each input, check that the referenced output exists in the UTXO set (LevelDB)

2. Check that each of the UTXOs referenced in the block has a Utreexo accumulator proof and verify.

3. Verify scripts and signatures for each input

3. Verify scripts and signatures for every input

The only difference here is that there is no database access for Utreexo CSN block validation. Instead, it uses Utreexo proof validation.

The code change was fairly minimal in that most of the block validation functions were kept the same. After checking the accumulator proofs, the now validated UTXO data (required for validating the block) is converted into the existing UTXO set cache structure called UtxoViewpoint (this would be CCoinsView in Bitcoin Core) and is then passed off to the existing verification functions.

Why Claim 4 is important

No forks needed to bring Utreexo to Bitcoin

Needing a fork for a new feature is a big risk/challenge in a decentralized system like Bitcoin. A hard fork for Bitcoin is mostly out of the question as the benefits of enabling a feature are outweighed by the potential for a chain split. Soft forks are also very difficult to roll out as they need a significant community buy-in.

On the other hand, if a new feature can just be opt-in without requiring a fork, the deployment of such a feature can be much simpler. BIP-152: Compact Block Relay is an example of a feature that has been widely adopted and doesn’t require a fork. The nodes that opt in to the Compact Blocks can choose to do so and since the feature is opt-in, nothing changes for anyone not interested in using it.

Realizing Claim 4

This is the easiest claim to realize as it was solved when Tadge Dryja first wrote up the Utreexo paper. We avoid having to soft fork by having transitional nodes called bridge nodes that bridge the new Utreexo nodes and current Bitcoin nodes.

A bridge node behaves the same as current Bitcoin full nodes when a non-Utreexo node connects to it. However, when a Utreexo node connects to a bridge node, it will serve Utreexo proofs on top of the normal block it would serve a non-Utreexo node.

As mentioned in the release article, the Utreexo binary provided is hardcoded to *only* connect to the bridge nodes that we run to avoid interference with the Bitcoin testnet network.

Why Claim 1 is important

A new full node mode in a few kilobytes that syncs as fast as a ssd on a hdd

The aforementioned UTXO set is a required part of running a full node. However, as adoption increases and the total Bitcoin available is split into finer and finer amounts, this UTXO set is going to increase in size. Currently, the UTXO set is around 4GB but this may grow to a size unwieldy for lower-cost devices. Slimming this UTXO set is critical if Bitcoin is to be widely adopted.

In current Bitcoin nodes, when any of the UTXOs are referenced in a block, the node will then need to fetch that UTXO, whether from disk or from the cache in memory. This is one of the present bottlenecks in Bitcoin if the node has a slow disk. With pruned nodes, this is even more of a constraint as the cached UTXOs are written to disk when the blocks are pruned. This results in pruned nodes being slower to sync versus non-pruned nodes as Bitcoin developer Pieter Wuille notes here.

With Utreexo CSNs, there are no slowdowns since there are no disk reads for the UTXO set. This results in the Utreexo CSNs performing about the same on any storage, whether that’s a nvme ssd or on a hdd.

Current Progress on Claim 1

The claim for “a full node in a few kilobytes” was not achieved as other metadata, including the block headers, took up a few hundred megabytes. Even though the chainstate is tiny, the other data became non-negligible in reaching the goal for “a full node in a few kilobytes”. For this release, we settled for a couple hundred megabytes.

Comparing just the chainstate with Bitcoin Core, it looks like so:

As you can see, the chainstate size for Utreexo CSN is mere 424 bytes, thus making it a rounding error for the entire size taken up by the node. In fact, the peers.json file, used for reconnecting to known peers on restart, took up 205 kilobytes, making it roughly 483 times bigger than the chainstate.

Here’s the performance difference between pruned Bitcoin Core and Utreexo csn when using a NVMe SSD vs a HDD.

The test was done by pointing the node being benchmarked to a different local Utreexo bridge node reading off a seperate NVMe SSD. The default testnet3 assumevalid of block 1,864,000 was used in Bitcoin Core. In CSN, assumevalid was implemented and was set to block 1,864,000. The testing was done up to block height 1,906,000 on testnet3.

The hardware used is:

  • CPU: AMD Ryzen 3600
  • Memory: Samsung 32GB DDR4 2666MHz
  • Local serving node NVMe drive: 2TB Sandisk ULTRA M.2 NVMe
  • Testing node NVMe drive: 1TB HP SSD EX950 M.2
  • Testing node HDD: Western Digital WD10EZEX-22BN5A0 1TB 7200RPM

Flags given to the Bitcoin Core node were:

  • -prune=550
  • -connect=127.0.0.1
  • -disablewallet
  • -blocksonly 
  • -testnet

With Bitcoin Core, running it on the NVMe SSD took 784 seconds while it took 1,066 seconds on the HDD. For Utreexo CSN, running it on the NVMe SSD took 1,643 seconds while it took 1,700 seconds on the HDD.

Note that there are still many performance optimizations to be made with the current Utreexo CSN implementation. It is currently slower than Bitcoin Core since we forked off btcd, a node that is much slower than Bitcoin Core. We will be doing a follow up release and article which will focus on performance.

Why Claim 2 is important

Allows for parallelization of the initial block download

To avoid confusion, the parallelization claim made here is referring to the parallelization in the chain level. This would mean that a single node will be simultaneously validating multiple block ranges e.g.: 100,001~200,000 and 200,001~300,000. The claim here is not referring to block level parallelization where transaction signatures in a block are validated in parallel (as this is already implemented in both btcd and bitcoin core).

Parallelization in computing refers to execution of multiple processes simultaneously. This can allow for a greater utilization of the available hardware (CPU), and may increase performance if there is any idle hardware. In recent years, CPU development has had trouble in increasing the clock speeds because of physical restraints. In turn, instead of increasing the clock speed, there has been more focus on increasing the number of cores. Software development paradigm has followed and there is a greater emphasis on parallelization today in an effort to take advantage of more CPU cores.

Parallelization of the initial block download has an incredible potential to be a game changer in the time taken to sync a full node, making it easier for individuals to run a full node. More nodes will make the Bitcoin network more resilient to attacks. In this sense, parallelization can be looked at as increasing the security of Bitcoin.

Current Progress on Claim 2

The verification of any block depends on the UTXO set at the previous block. For example, if we were verifying block 501, we need the UTXO set at block 500. However, to have the UTXO set at block 500, we need the UTXO set at block 499. This problem of being dependent on the UTXO set at the previous block goes all the way back to the genesis block (which is hardcoded in). This is where the difficulty in chain level parallelization comes in.

With Utreexo, this problem becomes much easier because the UTXO set is a few hundred bytes rather than a few gigabytes. This lets us hard-code the entire UTXO set representations into the software as starting points for parallel validation.

Note that the peer can be malicious and give a false UTXO set. However, this doesn’t lower our security assumptions since we have different CPU cores working to verify up to block height 499 from the genesis block. We continue verifying from block 501 and onwards, taking advantage of the fact that these CPU cores would have been idle otherwise. When the verification of genesis to height 499 completes, we can now check the block to see if the block and the UTXO set at block height 500 match up. Thus the hard-coded UTXO set representations are only used as hints to allow faster processing, and everything is still validated.

In order to support this type of chain level parallelization, the codebase must support having multiple active chainstates. The main difficulty of having an arbitrary n number of chainstates (or even just two) is having to keep track of n number of UTXO sets. Since a UTXO set requires a database and a cache of the UTXO set on disk for speedups, this makes the hardware requirements of running the node a lot higher. However, since Utreexo CSN does away with the database for the UTXO set store, this is not a problem.

The implementation for having multiple chainstates (CChainState in Bitcoin Core, type Blockchain in btcd) is already underway. The work is vastly simplified for Utreexo CSNs as there is no need to have a database for each chainstate. This allows the implementation of any n number of chainstates doable.

Currently we’re still working on how the network p2p messages should be handled per chainstate. We’re trying out different methods (having two initial block download managers; keeping track of which chainstate asked for which blocks; etc) but still have a ways to go.

Limitations of this release

In the current release, there is no support for chain reorganization and mempools. Thus the node will run in “blocksonly” mode and if there is a reorg, the node will crash. These two things are yet to be implemented in the Utreexo library and is the reason why this release is only meant to be a demonstration. We don’t support bitcoin mainnet with this release, and it should not be used with real money, as it’s still in an early stage with known bugs.

Going forward

As mentioned in “Current Progress on Claim 1”, we will be doing more performance optimizations on the Utreexo CSN. This involves both speeding up the Utreexo accumulator and btcd components. We are currently aware of many issues that will speed up the CSN once fixed (which is just a matter of implementation and testing).

The work for implementing reorganization support began last year and was on hold since there were many other pressing issues that needed to be prioritized. Reorgs will be implemented in the near future. While the implementation of mempool support has not yet begun, we have been planning out how this will be implemented for a while now. I fully expect mempool support to be implemented this year.

The current Utreexo accumulator code is in Go. Porting the accumulator code to Rust and C++ is an ongoing effort. We’re not sure how long everything will take, but we’re at a point in the code where the foundations are there, and human-level parallelism can be readily exploited. There are plenty of things to work on if people would like to contribute!