Fixing The Privacy Gap In Proof Of Liability Protocols

Abstract: We examine the state of proof of assets and proof of liability schemes in the cryptocurrency space, systems whereby exchanges publish a list of user balances, which add up to the total exchange liability. Users are then able to verify their balance is included in the list, proving assurance over the solvency of the exchange. We look at Coinfloor’s impressive system, which includes monthly attestations. However, we discuss the low level of adoption from large exchanges in the space, which may be partly driven by privacy concerns related to such protocols, which can reveal information detrimental to user interests, such as account balances and how these balances change over time. We propose a new proof of liability scheme, which mitigates many of these privacy problems. The key innovation is that user account balances are split into multiple chunks in the liability list, with the ratio of the split determined by a random number generator, thereby information about the distribution of account balances on the participating platform is no longer revealed.

Overview

In traditional finance, one of the tools available for potential clients when evaluating the solvency of deposit taking financial institutions are third party audits. This same procedure, third party audits and the publication of financial statements, can of course still occur in the cryptocurrency space. However there are other, much stronger tools available, most famously proof of liabilities and proof of reserves. Combined, these two proofs can become a proof of solvency, at least with respect to a particular token.

Proof of Reserves > Proof of Liabilities ⇒ Solvency

In the cryptocurrency space non-custodial exchanges or decentralised exchanges are becoming increasingly popular and in this area, where there is no custodial risk, solvency is not normally an issue, assuming there are no critical bugs. However, centralised custodial exchanges remain popular, due to several advantages they have over non-custodial systems, such as lower latency, higher capacity, and superior characteristics in the area of maintaining the privacy of orders until execution. 

In this report, BitMEX Research works in collaboration with the BitMEX Wallet team to propose a new proof of liability proposal. We build on Greg Maxwell’s proof of liability scheme, which involves constructing a Merkle tree, with each leaf representing a user’s balance, by adding new mechanisms to the tree construction to improve user privacy.

Proof of Assets

In the cryptocurrency space, implementing a proof of assets system is simpler than a proof of liabilities system. For proof of assets, all a centralised platform needs to do is periodically publish a full list of all the Bitcoin addresses the exchange claims to control. Any third party can then take this list and sum up all the bitcoins associated with these addresses on the blockchain at particular block height. Therefore, essentially all that is required to prove an exchange’s cryptocurrency assets is the publication of a list of addresses. An improvement on this would be to publish or disclose how the addresses are derived from a set of public keys, for instance redeem scripts or derivation paths for every address under the exchange’s control.

The above system is not perfect. There’s a possibility the exchange could lie about having the private key associated with a particular address. However, if the exchange demonstrates the ability to sign messages or transactions regularly with all such keys, then third parties may have a high degree of assurance over the proof of assets. This is the case, for example, with the BitMEX exchange. BitMEX uses the same three public keys to generate all of its addresses and conducts daily withdrawals.

An exchange could of course exclude some assets from the list. However, the proof of assets value can be considered as the minimum value of assets under an exchange’s control, and as long as the assets are greater than the liabilities, the exchange can be considered solvent.

Proof of Liabilities

In contrast to a proof of assets, a proof of liabilities is a much more challenging concept. An exchange could publish a list of all the liabilities at a particular point in time, just like with the proof of assets. This list could theoretically include the account balance and name of every creditor (more on why this isn’t feasible for privacy reasons below). However, an exchange could always exclude certain liabilities from the list, thereby making the company appear solvent when actually it is in an insolvent position. To mitigate this problem, there is the idea that every individual creditor could look through the public list of liabilities and see if they are included. If any one creditor claims they are not in the list, the public could potentially consider the exchange as fraudulent. There is of course the problem that not all creditors check the list. However, this system is likely to be reasonably robust, in the sense that users can have far more assurance over the solvency of the exchange compared to platforms that do not conduct this process at all.

The above methodology is obviously particularly bad for user privacy, in that the public would know every user’s name and account balance history. Prior to this paper being published, there were two methodologies that could be used in conjunction with each other to improve user privacy. These two ideas were initially described by Greg Maxwell in around 2014.

  1. Obfuscating the link between published balances and user identities – The user’s account name could be excluded from the list; instead the list could contain a hash of the user’s name/email conjugated with a random nonce assigned to each user. That way an individual user could ensure they were included in the list, while their name would not be revealed to the public.
  2. Partial publication of user balances – The full list of liabilities is not published, instead the liabilities are all arranged as leaves in a Merkle tree structure, with the total liabilities of the exchange at the top of the Merkle tree and the hash at the top of the Merkle tree published. Each user is then shown the minimum information necessary to get from their account balance to the root of the Merkle tree. Thereby, each user can have a high degree of confidence that their balance is included in the total, whilst only learning minimal detail about other users. 

We have provided a simple Merkle tree below as an example, with an exchange that has only four accounts. If you are the holder of account 001, then only the information inside the red area is required in order to ensure your balance of 200 satoshis is included in the exchange liability balance of 1,400 satoshis.

Figure 1 – Example of proof of liability scheme

The approach of only revealing different parts of the tree to different account holders may not be particularly strong. For example, third parties such as media organisations or regulators gain little assurance as to the liabilities. In addition to this, if enough account holders publish the parts of the tree they know (which could for example be one entity with multiple accounts trying to learn about the tree structure) then the details of the tree could gradually enter the public domain anyway. At the same time, if enough privacy techniques are used to obfuscate the link between the leaves and individual creditors, then failing to disclose the full tree to the public may provide limited privacy benefits. Therefore, in our view, it’s probably necessary and effective for an exchange to publish the entire tree. One could argue that this makes the Merkle tree structure pointless and the exchange should use the list approach, however, adopting the tree provides flexibility, should the exchange wish to cease publication of the full list at some point.

Current State of Proof of Liabilities

To our knowledge, the UK-based spot Bitcoin exchange Coinfloor is the only exchange which publishes a full proof of solvency. Coinfloor publishes a full proof of assets and proof of liabilities each month. The exchange publishes a simple list of all the user account balances in a table with two columns. The first column is a hash of a random string assigned to each user conjugated with the timestamp of the exercise. The second column is the user balance in Satoshis. After Coinfloor does this exercise, they then take a hash of the proof of liability table and add this hash in the OP_Return field of a Bitcoin transaction, which spends a greater number of Bitcoin than were included in the liabilities. Therefore this acts as a proof of asset scheme, proving the solvency of Coinfloor. One can see an example of such a Bitcoin transaction here.

Kraken is another exchange who has experimented with proof of liability protocols. However, as far as we know the scheme they have developed is not currently active. One can also read a recent article by Nic Carter, which includes a summary of the adoption of proof of reserve schemes by various cryptocurrency exchanges. In our view, Coinfloor has the most robust system currently deployed.

Privacy Problems

The Coinfloor system appears reasonably strong and probably provides as good of an assurance one could ever hope for about the solvency of the Coinfloor platform. It is not perfect, at least from a user privacy perspective. There are also potential commercial considerations for Coinfloor. The company publishes the exact distribution of all their customer account balances every month. As Figure 2 and 3 below indicate, industry analysts and perhaps competitors can obtain useful information about Coinfloor’s customers and potentially use this information in a manner not conducive to Coinfloor’s interests.

Figure 2 – Coinfloor user account distribution by account balance (April 2021)

Source: Coinfloor, BitMEX Research

Figure 3 – Coinfloor Bitcoin held under custody by account balance (April 2021)

Source: Coinfloor, BitMEX Research

Analysts can also use the information to infer more details about individual Coinfloor customers. For example the public can see all the account balances and if account balances do not change (for example dormant accounts) it may be possible to link the accounts together. For example Coinfloor’s April 2021 solvency report includes an account with a balance of 49,456,707 satoshis. The March 2021 solvency report also contains an account with a balance of 49,456,707 satoshis. Therefore one may believe with a reasonably high degree of probability that there exists an account on Coinfloor with a balance of 49,456,707 satoshis, which has remained unused for at least one month.

The Coinfloor system appears robust and effective and Coinfloor’s conduct in this area should be commended and the weaknesses we point out here are reasonably small. In reality it is likely Coinfloor management are not particularly concerned by the potentially commercially sensitive information they are revealing and that Coinfloor users are not especially bothered by this either. However, if a major exchange like Binance, Coinbase or FTX conducted this process, it is not too difficult to imagine that cryptocurrency hedge funds and industry analysts poring over the data each month, making inferences that would not always be especially helpful to the exchanges or their depositing customers. In our view these privacy concerns are preventing major exchanges from adopting these proof of liability schemes.

Proposed Solutions to Privacy Issues

The BitMEX Wallet team and Research team have developed a scheme to mitigate some of the concerns mentioned above. The core idea is that individual account balances are randomly split between multiple leaves. Each user balance is split at least once and therefore goes into at least two leaves. At each snapshot point, perhaps every month, a random real number between 0 and 1 is generated for each user. The user balance is then split according to this fraction, for example if a user had a balance of 200 satoshis and the number 0.400 was chosen, the balance would be split into two chunks, the first with 80 satoshis and the second with 120 satoshis. This splitting strategy happens to the list until a sufficient amount of padding is achieved, resulting in two or more splits for each user balance. To verify your balance, you would need to find it in multiple leaves of the tree (or multiple entries in a list) and then sum them up to check it matches your total expected account balance. The main details of the newly proposed scheme are summarised below.

A random account nonce is generated for each user and disclosed to each user. This account nonce is static throughout the lifetime of the account. Users are expected to keep this private to ensure the privacy of their account balances.

Example Account nonce: 8h9n09il7sdftg74r1

In addition to the Account nonce, there is also a sub-nonce, which changes depending on the Bitcoin block height. This works in a similar fashion to Coinfloor’s SHA1 hash which has a timestamp as an input in the hash function and therefore changes for each proof of liability snapshot.

Sub-nonce = Sha256(Account nonce, Bitcoin block height)
Example sub-nonce: sha256(8h9n09il7sdftg74r1,680359)=3d70334e74430e0dc4b4e4c8a4152a604a6183428247cf151daa991eb89e9a41

There is then one more hash to calculate, this hash will be included in the Merkle tree, which may be disclosed to the public.

Hash(Account ID, Sub-nonce, Leaf balance in Satoshis, leaf position index)
sha256(001,3d70334e74430e0dc4b4e4c8a4152a604a6183428247cf151daa991eb89e9a41,80,2)=4bf681c36300cbf94cbcd0456ede75826ed90c51497821fd732adba76b22fe51

Figure 4 below illustrates the Merkle tree that could be published using this scheme and this example. The positions of the leaves are shuffled for each snapshot. A liability balance is displayed in plain text on each leaf in the tree, which adds up to the total liabilities at the top of the tree. At the same time the hash digest of each leaf is conjugated with a neighbouring leaf and hashed again, to move up the tree. Each hash includes both the hash digest and the balances in the branches below as inputs.

Figure 4 – Example of proof of liability scheme with improved privacy characteristics

The account balances in Figure 4 are the same as for Figure 1, except that each account has been split into two leaves and the position of the leaves has been shuffled. For example account 001 has been split into two leaves, leaf index two with an 80 sat balance and leaf index four with a 120 sat balance. Together these two leaves add to the total balance of 200 satoshis, which is visible in Figure 1. The sub-nonce is provided by the exchange to account holder 001. The client can now see their account balance in two leaves, by searching accross the bottom of the entire tree and looking for a collision.

The combined impact of the two privacy techniques, shuffling and splitting, ensures many privacy characteristics for the exchange and users are preserved. It is no longer possible to see a distribution of all the account balances or track the balances of any users over time.

The hypothetical illustration below, Figure 5, shows the BitMEX exchange revealing the necessary information to a user, when logged in to the platform.

Figure 5 – Illustration of the BitMEX exchange platform disclosing the necessary proof of liability information to a user

How to Conduct the New Proof of Liability Process

Assuming that the exchange publishes the full Merkle tree, a user can verify their liabilities are included in the total as follows:

  1. Download the entire Merkle tree over the internet, this could be several hundred MB of data
  2. Download a provided software package that can scan across the Merkle tree leaves
  3. Obtain the user account ID and latest sub-nonce from the exchange after logging into the platform
  4. Locally, the installed software will scan across all the leaves of the Merkle tree, performing hash functions and looking for a collision in each leaf. The inputs to the hash function are the account ID and sub-nonce provided, as well as the leaf index and balance. The leaf index and balance can be inferred from the tree itself, the leaf index is known by the position in the tree and the balance is already provided in plain text in each leaf.
  5. Once all the collisions have been identified, the user is told which leaves contain their balances. These can be summed up to make sure they match the user’s expectations and their account balance at the snapshot point. The user can then be reasonably certain their balance is included in the list of liabilities.

Summary of Fields in the Merkle Tree

The following table includes all the fields which are inputs to the hash where the output string is displayed in the leaves of the Merkle tree. The table explains why it is necessary to include each field.

Item

Why is it necessary?

Sub-nonce

The sub-nonce is necessary because it enables users to reveal information to the public or third parties that can be used to prove their balance is contained in a particular snapshot, without revealing the entire account balance history (and future balances). The account nonce can be kept secret.

Leaf index

The leaf index is a necessary field because it prevents collisions between leaves in the same Merkle tree. Such collisions would prove that both leaves belong to the same account holder (since the sub-nonces would match) and therefore potentially reveal too much information about a user’s balance to the public.

Balance

Including the balance is necessary to ensure the hashes commit to the values being added up and that these values flow up the tree. This then enables individuals to compare the root hash with each other and as long as the hashes match, they know they have the same Merkle tree, commiting to the same balances.

Four Types of Assurance

Using these proof of liability schemes there are essentially four of types of assurance stakeholders can use to evaluate the exchange:

  1. Examine the Merkle tree and liability list to ensure that it sums up to the total liability balance and that this amount is lower than the reserves
  2. Verify that one’s own balance is included in the public liability list
  3. Verify that the balances of other users are included in the liability list, by receiving the sub-nonce and account ID from other users, either directly from them or if they have chosen to make them available to the public
  4. Observe the lack of other reputable account holders claiming their liability is absent from the liability list (A hostile entity can always claim this and we are unsure there is a perfect way to prevent this)

Conclusion

We have described a new proof of liability proposal, which splits user balances into multiple chunks and randomly shuffles these as leaves in a Merkle sum tree. The proposal includes other fields to prevent collisions and a nonce that changes each snapshot. When combined together these innovations allow users to have a higher degree of confidence over the solvency of the exchange, without any significant privacy leakages. The downside to this new scheme is complexity, however the exchanges which adopt this scheme should bear the burden of the complexity and it should be possible to provide a set of tools allowing users to conduct this process themselves with a reasonably low degree of effort.

Adoption of proof of solvency schemes has been low to date. This low adoption appears to be driven by both a lack of consumer demand and also the potential privacy issues. We believe this proposal mostly solves the problems associated with the latter. As for the former, this may remain an issue.