Fast or Focused? Making Low Latency Accurate
By Ali Pichvai
Quod Financial CEO Ali Pichvai advocates a re-examination of speed relative to risk.
The oversimplified debate on latency, which holds that ‘trading is all about speed’, does not represent the true situation. Latency requirements are primarily a consequence of the market participant’s business model and goals. A liquidity provider sees latency competitiveness as vital, whilst a price taker considers it of less importance in the overall list of success factors. This article focuses solely on processing efficiency, since distance and co-location have long been debated. The processing efficiency is determined by:
*Number of processes:
The number of processes, and the time an instruction spends in each process, give a good measure of latency. As a general rule of thumb, the fewer the processes, the lower the latency. An arbitrage system will most likely consist of as few processes as possible, with a limited objective. For instance, a single-instrument arbitrage between two exchanges can be built around three processes: two market gateways and one arbitrage calculator/order generator. An agency broker system will host more processes, with pre-trade risk management, order management and routing, intelligence for dealing with multi-listing, and the gateway as the minimum. Latency reduction has sometimes come at the expense of critical processing; for instance, in the pursuit of HFT houses, some brokerage houses provide naked direct market access, which removes pre-trade risk management from the processing chain. An initial conclusion is that it is very hard to reconcile a simple, limited-in-scope liquidity provider system with the more onerous price taker systems.
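As a rough illustration of this rule of thumb, end-to-end latency can be approximated as the sum of the time an instruction spends in each process. The sketch below compares the three-process arbitrage chain with the longer agency broker chain described above. The per-process timings are invented purely for illustration and are not measurements; the names `ARBITRAGE_CHAIN`, `AGENCY_BROKER_CHAIN` and `end_to_end_latency` are likewise hypothetical.

```python
# Hypothetical per-process timings in microseconds (illustrative, not measured).
ARBITRAGE_CHAIN = [
    ("market gateway (inbound)", 20),
    ("arbitrage calculator / order generator", 15),
    ("market gateway (outbound)", 20),
]

AGENCY_BROKER_CHAIN = [
    ("pre-trade risk management", 40),
    ("order management and routing", 60),
    ("multi-listing intelligence", 50),
    ("market gateway", 20),
]


def end_to_end_latency(chain):
    """Sum the time an instruction spends in each process of the chain."""
    return sum(time_us for _name, time_us in chain)


arbitrage_us = end_to_end_latency(ARBITRAGE_CHAIN)
agency_us = end_to_end_latency(AGENCY_BROKER_CHAIN)

print(f"arbitrage chain: {len(ARBITRAGE_CHAIN)} processes, {arbitrage_us} us")
print(f"agency chain:    {len(AGENCY_BROKER_CHAIN)} processes, {agency_us} us")
```

With these assumed figures the shorter chain is faster simply because it does less, which is the trade-off the naked-access example makes explicit: dropping the pre-trade risk entry lowers the total but removes a control.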
Efficiency also depends on the flow between processing points, which should be as efficient as possible, with minimal loops between processes, waiting time and bottlenecks. This takes a comprehensive view of the architecture, from the network to the application.
*Single process efficiency:
Two important areas must be reviewed:
There is an ongoing debate on what the best language for trading applications is. On one side are the Java/.NET proponents, who invoke the ease of coding and maintenance that a high-level development language offers (at the expense of the need to re-engineer large parts of the Java JVMs). On the other side are the C++ evangelists, who cite better control of resources, such as persistence, I/O and physical access to the different hardware devices, as evidence of better performance. The migration of main exchanges and trading applications away from Java to C++ seems to indicate that the second church is in the ascendancy. Beyond the coding language, building good parallelism in processing information within the same component, also called multithreading, has been a critical element in increasing capacity and reducing overall system latency (but not unit latency).
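The distinction between overall system latency and unit latency can be made concrete with a deliberately idealised model. The sketch below does not run real threads; it is simple arithmetic that assumes each worker handles one message at a time, with no contention or scheduling overhead. The function name `batch_completion_time` and all figures are hypothetical.

```python
import math


def batch_completion_time(messages, unit_latency_us, workers):
    """Time to drain a burst when each worker handles one message at a time.

    Idealised model: no contention and no scheduling overhead (an assumption).
    """
    rounds = math.ceil(messages / workers)
    return rounds * unit_latency_us


UNIT_LATENCY_US = 50   # hypothetical time to process a single message
MESSAGES = 1_000       # hypothetical burst size

single = batch_completion_time(MESSAGES, UNIT_LATENCY_US, workers=1)
parallel = batch_completion_time(MESSAGES, UNIT_LATENCY_US, workers=8)

print(f"1 worker : burst drained in {single} us, unit latency {UNIT_LATENCY_US} us")
print(f"8 workers: burst drained in {parallel} us, unit latency {UNIT_LATENCY_US} us")
```

Under these assumptions the burst drains eight times faster with eight workers, yet a single message still takes the same 50 microseconds, which is why parallelism raises capacity without improving unit latency.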
Finally, there are attempts at putting trading applications, or components of them, on hardware, which is often referred to as hardware acceleration. The current technology can be very useful for latency-sensitive firms, to accelerate the most commoditised components of the overall architecture. For instance, vendors are providing specific solutions for market data feedhandlers (in single microseconds) that would result in market data to arbitrage signal detection of tens of microseconds. Yet trading is not standard enough to be easily implemented on such architecture. Another approach is the acceleration of some of the messaging flow, by accelerating middleware and network-level content management. This goes hand in hand with the attempts of leading network providers to move more application programming lower into the network stack.
o Exploiting multi-core processors:
With the rapid increase in the number of cores available on the same processor, new techniques to exploit this chip architecture are essential to successful software development. Acceleration techniques, such as bypassing the kernel or the network layers, require a rethink of software design. The aim is to maximise the processing power available on the same chip rather than to spread the work onto another server.
Liquidity providers and price takers have very distinct aims, which in some instances lead to contradictory execution objectives. The current latency push is driven as much by technology change as by the marketing of vendors and venues that focus indiscriminately and solely on latency to prove excellence. In effect, a one-size-fits-all formula based uniquely on latency is counter-productive. Latency should be approached with an understanding of the current technology challenges and of the upcoming changes, whilst taking into account the overall execution and trading goals.
Defining latency requirements
Latency requirements are primarily defined by the business objectives, with a broad distinction between liquidity providers and price takers.
For liquidity providers, the latency that matters is relative latency, defined as their ability to be faster than their peers and the exchanges. The aim is to act within the shortest time possible, both in detecting a price discrepancy and in executing on it. This has become the primary driver of the current arms race for exchanges and venues (to attract liquidity) as well as for liquidity providers (to beat their competitors). Interestingly, liquidity providers focus mostly on the top-of-book price (and spread), which tends to have low fill rates, and the latency that counts for them is single-order latency. For price takers, the latency that matters is absolute latency, which is their ability to take liquidity in a fragmented marketplace. This category is focused on the time it takes to execute an overall investment strategy. Consequently, they are mostly interested in the fill rate and have a much higher appetite to take liquidity within the order book.
To illustrate the difference between these categories, consider two systems with different latencies and hit ratios: System A has an average hit ratio of 90% and a round-trip latency of 1ms, and System B has a hit ratio of 30% at 0.5ms. For every 1,000 orders executed, System A would therefore provide better results (+50%) for a price taker than System B. In real life, System A (price taker-oriented) and System B (liquidity provider-oriented) would show much larger performance differences than in this example. System A would create real capital risks for a liquidity provider, and System B would give a price taker very poor execution performance.
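The article does not spell out how the +50% figure is derived. One reading that reproduces it, assuming orders are sent back to back so that a faster system simply sends more orders in the same time, is to compare expected fills per millisecond. The function name `fills_per_ms` is hypothetical; the hit ratios and latencies are those quoted above.

```python
def fills_per_ms(hit_ratio, round_trip_ms):
    """Expected filled orders per millisecond when orders are sent back to back."""
    return hit_ratio / round_trip_ms


# Figures quoted in the text.
system_a = fills_per_ms(hit_ratio=0.90, round_trip_ms=1.0)
system_b = fills_per_ms(hit_ratio=0.30, round_trip_ms=0.5)

# Relative advantage of System A over System B for a price taker.
advantage = system_a / system_b - 1

# Same comparison over the time System A needs for 1,000 orders (1,000 ms).
window_ms = 1_000 * 1.0
fills_a = system_a * window_ms
fills_b = system_b * window_ms

print(f"System A: {fills_a:.0f} fills in {window_ms:.0f} ms")
print(f"System B: {fills_b:.0f} fills in {window_ms:.0f} ms")
print(f"System A advantage: {advantage:+.0%}")
```

On this reading, System A fills about 900 orders in the time System B fills about 600, even though System B is twice as fast per order. Other readings are possible, and the model ignores queueing, partial fills and market impact.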
In reality, it is much more complex than the example above, with latency, hit ratio and standard deviation (which, in some cases, provides the predictability of the execution) all entering into the equation. The table below illustrates some large differences between the two main categories:
As explained in the analysis above, a narrow emphasis on latency would not only give poor results but also carry risks. A better approach is to look at smarter latency: the shortest amount of time that it takes to execute an order or instruction with the highest success rate and the lowest capital and execution risk.
Smarter latency: Risk and rewards
The smarter latency benefits for a price taker include:
- Higher fill rates: Visiting as many lit and dark venues as possible, focusing on all levels of liquidity and hunting for hidden liquidity.
- Lower cost: Avoiding expensive co-location fees, mixing execution across venues with competitive price structures (for example, the rebates offered by ECNs/MTFs for passive orders), and tapping into venues that provide special liquidity, such as dark pools.
- Lower risk: Putting more risk management in place to manage not only position risk but also execution risks, which are inherent in millisecond machine-to-machine interaction (e.g. a rogue algorithm could quickly create a huge number of unwanted orders/quotes before a human can detect it).
Such results can be obtained through algorithms and smart order routing applications that spend more time discovering higher-quality liquidity (e.g. level 2 prices across different lit venues and dark pools), while minimising the costs and fees incurred and lowering risk.
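A minimal sketch of the routing idea, under stated assumptions: a buy order is filled greedily from the cheapest all-in levels (price plus fee) across several venues' level 2 quotes. The venue names, prices, sizes and fees are invented, the `Quote` type and `route_buy` function are hypothetical, and dark liquidity is treated as known in advance purely to keep the example short. A real router would also have to handle latency between venues, rebates, and quotes that disappear before the order arrives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    venue: str     # hypothetical venue name
    price: float   # offer price per share
    size: int      # shares available at this level
    fee: float     # hypothetical take fee per share


def route_buy(quantity, quotes):
    """Greedily fill a buy order from the cheapest all-in levels first.

    Returns a list of (venue, shares, all_in_price) and the unfilled remainder.
    """
    allocations = []
    remaining = quantity
    for q in sorted(quotes, key=lambda q: q.price + q.fee):
        if remaining == 0:
            break
        take = min(remaining, q.size)
        allocations.append((q.venue, take, q.price + q.fee))
        remaining -= take
    return allocations, remaining


# Invented level 2 snapshot across two lit venues and one dark pool.
BOOK = [
    Quote("LIT_A", 10.00, 300, 0.003),
    Quote("LIT_A", 10.01, 500, 0.003),
    Quote("LIT_B", 10.00, 200, 0.001),
    Quote("DARK_C", 10.005, 400, 0.000),
]

allocations, unfilled = route_buy(1_000, BOOK)
for venue, shares, all_in in allocations:
    print(f"{venue}: {shares} shares at all-in {all_in:.3f}")
print(f"unfilled: {unfilled}")
```

The point of the sketch is that the extra decision time buys a complete fill at a lower all-in cost than hitting only the top of book on one venue, which is the trade a price taker is usually willing to make.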
The debate is clear: the exponential increase in trading speeds will inevitably continue as technology evolves, albeit at a less aggressive pace. However, this creates systemic risks that are too important to neglect, as the Flash Crash bore witness. HFTs, and market makers in particular, have an essential role in the health of the market by providing liquidity and lowering spreads, and an onerous clampdown on these market participants would harm the market rather than cure the risks that have been created. Liquidity providers benefit from simple algorithms and very fast infrastructure, yet this is to the detriment of price takers, who are the majority of market participants.
Smarter latency is the response to systems that have been designed with one single purpose, latency reduction, which often delivers a poor level of performance and a high degree of risk.
Investment in key technology platforms by all market participants, namely the buy-side, the sell-side and exchanges, but also regulators, would establish a reasonable, risk-managed market. Smarter latency would primarily enable price takers to benefit further from today's market through the addition of intelligence to the more latent components of the trading ecosystem.