WebSocket Systems Upgrade Complete: Significant End-User Latency Improvements

From 17:00 to 19:30 UTC on 21 April 2020, BitMEX deployed the next-generation version of its Feeds architecture. This system is internally known as the “Publisher”. It is responsible for receiving the firehose of raw data from the trading engine, parsing it from our internal IPC format to JSON, splitting it into subscriptions, and publishing it to the edge web servers.
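As a rough illustration of those four stages (receive, parse, split, publish), here is a minimal single-threaded sketch. It is not the Publisher's actual implementation: the internal IPC format is not public, so plain tuples stand in for raw engine events, and the function names, the `insert` action, and the sample values are all assumptions made for the example.

```python
import json
from collections import defaultdict

# Hypothetical raw engine events: (table, symbol, payload).
# Tuples stand in for the internal IPC format, which is not public.
RAW_EVENTS = [
    ("trade", "XBTUSD", {"price": 7000.5, "size": 100}),
    ("orderBookL2_25", "XBTUSD", {"id": 1, "side": "Buy", "size": 50}),
    ("trade", "ETHUSD", {"price": 170.25, "size": 10}),
]


def parse(raw):
    """Convert one raw event into a JSON-serialisable message (assumed shape)."""
    table, symbol, payload = raw
    return {"table": table, "action": "insert", "data": [{"symbol": symbol, **payload}]}


def topics_for(message):
    """Return the subscription topics that should receive this message."""
    table = message["table"]
    symbol = message["data"][0]["symbol"]
    # A subscriber may ask for a whole table or for one symbol within it.
    return [table, f"{table}:{symbol}"]


def publish(raw_events):
    """Parse each event, split it by subscription, and collect JSON frames per topic."""
    outbox = defaultdict(list)
    for raw in raw_events:
        message = parse(raw)
        frame = json.dumps(message)  # serialise once, reuse for every topic
        for topic in topics_for(message):
            outbox[topic].append(frame)
    return dict(outbox)


frames = publish(RAW_EVENTS)
```

In this sketch each event is serialised once and the resulting frame is reused for every matching topic, so the per-subscription work is only a list append.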

The result is a 10x improvement in p90, p95, and p99 latency within our internal Publisher system for most data feeds, with some feeds improving by nearly 20x. The feeds that improved the most were trade, orderBook (all types), order, and execution, across all symbols. The feeds that improved the least were position and margin.
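For readers less familiar with tail percentiles, the sketch below shows how p90, p95, and p99 can be computed from a set of latency samples using the nearest-rank method. The sample values are invented for illustration and are not BitMEX measurements, and the nearest-rank definition is an assumption; the post does not say how its percentiles are calculated.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [2] * 90 + [10] * 5 + [40, 60, 80, 100, 120]

mean = sum(latencies_ms) / len(latencies_ms)
summary = {p: percentile(latencies_ms, p) for p in (90, 95, 99)}
# summary == {90: 2, 95: 10, 99: 100}; the mean is about 6.3 ms
```

Here the mean is about 6.3 ms while p99 is 100 ms, which is why tail percentiles say more about outliers than an average does.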

Most of this latency improvement is passed directly on to the end user. While the Publisher was not the only source of latency between Engine events and your application, it was the largest and produced the most outliers. We expect your applications to see significantly lower latency variance. We are now targeting and stamping out lingering sources of latency in our WebSocket implementation to further improve these numbers.

This is the fourth generation of our Publisher architecture, and the fastest by far. It is capable of processing a very large volume of messages in parallel, while preserving intra-table ordering and accurate subscription building. However, inter-table ordering is no longer guaranteed.
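One practical consequence for client code is that state should be built per table, without assuming that a message on one table arrives before or after a related message on another. The sketch below is a hypothetical client-side handler, not part of any BitMEX library; the `seq` field is invented purely to make ordering visible. It shows that two different interleavings of the same per-table streams produce the same result when each table is handled independently.

```python
from collections import defaultdict


class TableState:
    """Keeps an independent, ordered log of rows for each table."""

    def __init__(self):
        self.rows = defaultdict(list)

    def apply(self, message):
        # Only the order of messages within the same table is relied upon.
        self.rows[message["table"]].extend(message["data"])


def replay(stream):
    """Feed a sequence of messages into a fresh state and return the per-table logs."""
    state = TableState()
    for message in stream:
        state.apply(message)
    return dict(state.rows)


# Hypothetical messages; "seq" is an illustrative field, not a real one.
trade_msgs = [{"table": "trade", "data": [{"seq": 1}]}, {"table": "trade", "data": [{"seq": 2}]}]
order_msgs = [{"table": "order", "data": [{"seq": 1}]}, {"table": "order", "data": [{"seq": 2}]}]

# Two interleavings that both preserve intra-table order.
a = replay([trade_msgs[0], order_msgs[0], trade_msgs[1], order_msgs[1]])
b = replay([order_msgs[0], order_msgs[1], trade_msgs[0], trade_msgs[1]])
same = a == b  # True: per-table results do not depend on inter-table order
```

Logic that correlates tables, for example matching an execution to its order, would need to tolerate either message arriving first.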

The resultant improvement in latency variance is dramatic, as seen in this chart showing the average processing time of orderBookL2_25 updates before they reach the WebSocket servers.

A similar improvement is seen on the trade feed:

The vast majority of BitMEX subscriptions follow the above pattern. The following chart shows the improvement in mean, p90, p95, and p99 across all tables combined:

We hope you enjoy this improvement to the trading experience. Our teams are working hard to deliver more infrastructure upgrades, including engine, matching, and re-margining throughput; database throughput and capacity; web-tier response times and auto-scaling; and even new order and contract types. We will announce them as they launch in the coming months.