Low Latency Trading Insights

Share this post

Hierarchical Timing Wheels

lucisqr.substack.com

Discover more from Low Latency Trading Insights

Fresh insights on low latency and high frequency trading. Mostly new relevant technology, C++, FPGA, Verilog, tutorials, short notes and of course, rants. I intend this blog to have a more instructional rather than purely technical focus.
Over 2,000 subscribers
Continue reading
Sign in

Hierarchical Timing Wheels

Henrique Bucher
Oct 13, 2023
7
Share this post

Hierarchical Timing Wheels

lucisqr.substack.com
Share

Hierarchical timing wheels are a clever solution for the scheduler problem. The event scheduler is a recurrent theme in event-driven architectures: something triggers an event that needs to be checked in the future one or more times. I has particular use in trading where you want to correlate one market event with price movements in the future at several time deltas (1ms, 100ms, 1s, 1 min, etc). During strategy discovery and backtesting the scheduling can reach billions of events.

The scheduler problem can be summarized in the following interface:

No alt text provided for this image

Low Latency Trading Insights is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

The scheduler can be trivially coded using the standard containers but it will lead to log(N) algorithms for scheduling. Other trivial solutions will lead to O(1) for scheduling but O(logN) for checking. 

The main requirements are:

  1. Both scheduling and checking need to be O(1) even for extremely large number of events (eg 1E8). It is acceptable if it is slower at times (hiccups) for rebalance for example.

  2. Individual events can be submitted multiple times

  3. Distinct events can be submitted under the same time (same key in a map)

  4. Events need to be fired in strict order by time.

Hierarchical timed wheels solve this problem beautifully by creating a hash map where the hash function is a straightforward combination of the bytes that compose the timestamp (assuming an integer). Therefore there is no overhead of computing hashes.

The HTW algorithm has been used in many places like in the Linux Kernel, in libuv (a popular network event library).

I recently posted a challenge on Code-golf but unfortunately it was closed with the comment "this problem cannot be solved". By that you see the quality of the C++ moderators at that site. That goes as a warning about relying on StackOverflow for anything more complex than simple language constructs.

A 1987 seminal paper by Varguese and Lauck describe in great detail the algorithm and also proves mathematically that the solution is O(1) constant time for both check() and schedule().

Link to Paper

The way that I usually code these wheels is by creating a 16-by-16 matrix of buckets. Each bucket contains a linked list of events. Each row corresponds to 16 times the duration of the row below. The lowest row contains the time "quantum" or smallest unit of time. Let's say for the sake of an example that it is one microsecond.

It's easier to understand with a picture, below.

No alt text provided for this image

At each moment in time the timestamp corresponds to a location in the matrix. The current moment "now" is always at the lowest row so any event scheduled for right now will be put in the bucket (0,12), as in the example.

But what about events in the future? The algorithm is very easy to calculate. The future event bucket will be the located at the first row that is different from the event hash and the now hash. In the example, it will be row 5, column 9.

As time progresses, the bottom bucket advances, firing events in each of the bucket's linked list. As it reaches the right of the wheel, it circles back to the first bucket (zero) and the bucket in row 1, (1,8) in the example, is downgraded into the row zero. And that recursively.

The algorithm scales massively with billions of events tested in our practice without a glitch. We use it to synchronize large number of data sources, as for example reading capture files from several exchanges at the same time.

This technique is also used in backtesting, to schedule the calculation of returns from a given event at different horizons, which tends to create a large number of future scheduled events.

The algorithm's structure is well suited for hardware development (eg FPGAs). We have implemented it with great success on Xilinx devices.

Care needs to be taken though with occasions where time jumps like when dealing with historical data. In this case the advance and promotion of levels can take time. To solve this problem each row contains its own bitmap signalling that it is empty or not. That allows for fast skipping through large periods with no events.

Low Latency Trading Insights is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

7
Share this post

Hierarchical Timing Wheels

lucisqr.substack.com
Share
Comments
Top
New
Community

No posts

Ready for more?

© 2023 Henrique Bucher
Privacy ∙ Terms ∙ Collection notice
Start WritingGet the app
Substack is the home for great writing