Understand the Power of Data Convergence

By Max Colas

Max Colas of CameronTec looks at smarter approaches to information overload and explains how improved management of data convergence can result in greater business insight and edge.

Every day Twitter delivers 300 million messages, 4% of which are actual news. Every 20 minutes, 3 million messages are published on Facebook and 10 million comments are added. Such mind-blowing numbers would be anecdotal if they did not highlight a trend – perhaps even a threat – that is also relevant in the trading world: information overload.

As usage of FIX grows globally and firms increasingly rely on their trading platform to contribute to their business edge, the risk for FIX users is that they focus on the wrong snippets of information, or miss the truly relevant trends. Addressing those challenges becomes a differentiator for FIX technology providers.

Previous generations of monitoring systems focused on displaying information, adding value through the shaping of data or the user-friendliness of the interface, for instance by displaying logs with FIX tag/value expansion or by showing “conversation views” that gathered related messages together. The mostly static log formats even allowed vendors to claim some degree of compatibility across FIX engines. Although useful, such systems are inherently flawed for two reasons:

1. They assume that FIX operators should approach information linearly, and

2. They expect all information that is relevant to a business to be contained in the logs.

Neither assumption proves true in today’s environment.

When algorithmic trading is involved, it is not unusual for FIX logs to grow by 10,000 lines per second for each session. When data flows converge from a number of FIX nodes across a pan-European topology, the dataset size can increase by multiple orders of magnitude. We are way past the display of logs on a screen. Gone is the linear approach to FIX data; gone is the time of perusing pages of logs one after another, of X-term windows scrolling slowly on a screen.

In fact, the only approach that remains at this point is to expect monitoring systems to deliver on two channels: “I tell you in advance what I am interested in and you notify me when it occurs” and “I tell you what I am interested in and you bring me the relevant results”. These approaches are not new: in the outside world, they are called Google Alerts and Google Queries. Technologies developed to implement this paradigm in the financial industry, such as California-based Splunk, have been in use for a few years. They all tend to gravitate around the convergence of data into one central repository to broaden the breadth of searches. This, too, is an industry trend that is highly relevant to the FIX world, with a peculiar edge that is worth analyzing.
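The two channels can be pictured with a small sketch. The Python below is purely illustrative: `CentralRepository` is a hypothetical in-memory class, not the way Splunk or any FIX product is implemented. Events are assumed to be plain dictionaries keyed by FIX tag number (35 is MsgType, 55 is Symbol, 58 is Text), with an assumed `source` field naming the originating node.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Predicate = Callable[[Event], bool]
Notifier = Callable[[Event], None]


class CentralRepository:
    """Hypothetical converged store: every data source appends events here."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self._alerts: List[Tuple[Predicate, Notifier]] = []

    def register_alert(self, predicate: Predicate, notify: Notifier) -> None:
        """Channel 1: state an interest in advance; be notified when it occurs."""
        self._alerts.append((predicate, notify))

    def ingest(self, event: Event) -> None:
        """Store the event, then run every standing alert against it."""
        self.events.append(event)
        for predicate, notify in self._alerts:
            if predicate(event):
                notify(event)

    def query(self, predicate: Predicate) -> List[Event]:
        """Channel 2: state an interest now; get the matching events back."""
        return [e for e in self.events if predicate(e)]


repo = CentralRepository()
notified: List[Event] = []

# Standing alert: any session-level Reject (MsgType 35=3) from any node.
repo.register_alert(lambda e: e.get("35") == "3", notified.append)

repo.ingest({"source": "fix-node-1", "35": "D", "55": "VOD.L"})
repo.ingest({"source": "fix-node-2", "35": "3", "58": "Required tag missing"})

# Ad hoc query: everything that arrived from one node.
from_node_1 = repo.query(lambda e: e["source"] == "fix-node-1")
```

Because both channels read from the same pool, an interest expressed as a query today can be registered as an alert tomorrow without touching the data sources.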

FIX data alone is bare and lacks the information needed to build a business edge, and systems that solely seek to draw insight from it miss out on extra dimensions that make a real difference. A broker who enters into performance-related service level agreements with time-sensitive customers should proactively monitor its adherence to the contracted latency terms. This reasonable business need underpins four requirements:

1. Monitoring FIX message latency (i.e. generating performance data),

2. Analysing the data on a continuous and rolling basis,

3. Comparing results against contracted thresholds (which can differ from one client to another and therefore require client data), and

4. Disseminating the results (as confirmation of adherence or alerts of breaches) to account managers and perhaps the customer itself, which requires contact details normally held in corporate directories or configuration datasets.
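The four requirements can be sketched as one small pipeline. This is a minimal illustration under stated assumptions, not a production design: the client names, thresholds and contact addresses are invented, timestamps are assumed to be integer microseconds, and `SlaMonitor` is a hypothetical class name.

```python
from collections import defaultdict, deque
from statistics import mean
from typing import Deque, Dict, List, Tuple

# Hypothetical client data: contracted latency ceilings in milliseconds.
SLA_MS: Dict[str, float] = {"CLIENT_A": 5.0, "CLIENT_B": 20.0}

# Hypothetical contact data, as might be held in a corporate directory.
CONTACTS: Dict[str, List[str]] = {
    "CLIENT_A": ["account.manager.a@broker.example"],
    "CLIENT_B": ["account.manager.b@broker.example", "ops@client-b.example"],
}


class SlaMonitor:
    def __init__(self, window: int = 100) -> None:
        # Keep only the most recent `window` latency samples per client.
        self._samples: Dict[str, Deque[float]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def record(self, client: str, received_us: int, acked_us: int) -> None:
        """Requirement 1: turn two timestamps into a latency sample (ms)."""
        self._samples[client].append((acked_us - received_us) / 1000.0)

    def check(self, client: str) -> List[Tuple[str, str]]:
        """Requirements 2 to 4: rolling mean, threshold test, one message per contact."""
        samples = self._samples[client]
        if not samples:
            return []  # nothing measured yet, nothing to report
        observed = mean(samples)
        limit = SLA_MS[client]
        status = "BREACH" if observed > limit else "OK"
        text = f"{client}: {status} (rolling mean {observed:.1f} ms, limit {limit:.1f} ms)"
        return [(address, text) for address in CONTACTS[client]]


monitor = SlaMonitor(window=3)
for latency_us in (2000, 4000, 3000):      # 2 ms, 4 ms, 3 ms
    monitor.record("CLIENT_A", 0, latency_us)
healthy = monitor.check("CLIENT_A")

for latency_us in (9000, 9000, 9000):      # three slow orders push the old samples out
    monitor.record("CLIENT_A", 0, latency_us)
breached = monitor.check("CLIENT_A")
```

Note that only the first step touches FIX traffic; the threshold and the recipients come from client and directory data, which is the point the list makes.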

In other words, business edge is no longer drawn from FIX data alone: a body of derived and peripheral, yet necessary, data gravitates around order messages and contributes decisively to the shaping of the business. Separate solutions, each looking after one aspect of the process such as monitoring the latency figures, also prove suboptimal unless they truly integrate with all other data sources to nourish a central pool of data that provides the substance for business alerts and queries. The need for integration – which acknowledges that ‘niche’ expertise might be used if the business calls for it – places the onus on FIX infrastructure providers to open themselves through the use of industry standards, open and extensible languages, and rich and actionable APIs. Functional modularity and technical openness simply reflect the necessary diversity in the technology landscape that CTOs are called upon to draw.

Concentrating the actionable data is in itself hardly sufficient without analytical capabilities – a view that leading vendors have embraced as they now combine broad data sets with tools to build intelligence as part of their product suite. Similar capabilities are required at the lower level of trading backbones and infrastructures so that the most specific business interests are covered. To address these interests, advanced technologies such as complex event processing and data mining analytics are entirely relevant, and leading vendors make use of them. Such tools make it possible, for instance, to alert traders to perceived connectivity sluggishness whilst the FIX node reconfigures its FIX sessions on the fly to transit through alternate networks when, say, the average execution time for market orders from the algo desk exceeds 10 milliseconds over a rolling time window of 1 minute – all this automatically, with complete control and, of course, the ability to simulate and test those scenarios as part of the daily automated QA process.
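To make the rolling-window condition in that example concrete, here is a minimal sketch. It assumes execution times arrive already measured in milliseconds and stamped in seconds; `RollingAverageTrigger` is a hypothetical name, and what happens when it fires (alerting traders, rerouting sessions) is left to the caller. A real complex event processing engine would express the same rule declaratively.

```python
from collections import deque
from typing import Deque, Tuple


class RollingAverageTrigger:
    """Fires when the mean execution time over a time window exceeds a threshold."""

    def __init__(self, threshold_ms: float = 10.0, window_s: float = 60.0) -> None:
        self.threshold_ms = threshold_ms
        self.window_s = window_s
        self._samples: Deque[Tuple[float, float]] = deque()  # (timestamp_s, exec_ms)
        self._total_ms = 0.0

    def add(self, timestamp_s: float, exec_ms: float) -> bool:
        """Record one execution; return True if the rolling mean is over the limit."""
        self._samples.append((timestamp_s, exec_ms))
        self._total_ms += exec_ms
        # Evict samples that have fallen out of the window.
        cutoff = timestamp_s - self.window_s
        while self._samples[0][0] <= cutoff:
            _, old_ms = self._samples.popleft()
            self._total_ms -= old_ms
        return self._total_ms / len(self._samples) > self.threshold_ms


trigger = RollingAverageTrigger(threshold_ms=10.0, window_s=60.0)
results = [
    trigger.add(0.0, 8.0),     # mean 8.0 ms
    trigger.add(30.0, 11.0),   # mean 9.5 ms
    trigger.add(59.0, 14.0),   # mean 11.0 ms, over the limit
    trigger.add(200.0, 5.0),   # earlier samples have expired, mean 5.0 ms
]
```

Because the trigger is fed by plain calls with explicit timestamps, the same scenario can be replayed in a test harness, which is what makes the daily automated QA mentioned above practical.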

This final example sheds another light on the business requirements: understanding the gist of FIX data and all the data that gravitates around it hardly matters until the information is brought to the attention of the consumer. Avoiding information overload means sifting through a multitude of updates and notifications to separate the wheat from the chaff and then serving up the result however the consumer wants to absorb it. Twitter exists for virtually every platform precisely because different people call for different media. Escalation paths are also important when information is deemed actionable. The ability of notification systems to integrate with existing technologies and platforms, and the flexibility to tailor the message with adequate management of escalation, are precisely what differentiates business-centric solutions from technology-centric ones.
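An escalation path can be as simple as an ordered list of tiers, each with its own medium. The sketch below is an assumption-laden illustration: the `Tier` structure, the channel names and the recipients are all hypothetical, and actual delivery is left to whatever messaging platforms are already in place.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tier:
    after_s: int     # seconds an alert may stay unacknowledged before this tier is told
    channel: str     # delivery medium, e.g. "chat", "email", "sms" (hypothetical)
    recipient: str


def tiers_due(policy: List[Tier], raised_at_s: int, now_s: int,
              acknowledged: bool) -> List[Tier]:
    """Return every tier that should have been notified by `now_s`."""
    if acknowledged:
        return []  # someone has taken ownership, so escalation stops
    elapsed = now_s - raised_at_s
    return [tier for tier in policy if elapsed >= tier.after_s]


# Hypothetical policy for a latency breach.
policy = [
    Tier(after_s=0, channel="chat", recipient="fix-operations"),
    Tier(after_s=300, channel="email", recipient="account-manager"),
    Tier(after_s=900, channel="sms", recipient="head-of-trading"),
]

at_once = tiers_due(policy, raised_at_s=0, now_s=0, acknowledged=False)
after_ten_minutes = tiers_due(policy, raised_at_s=0, now_s=600, acknowledged=False)
after_ack = tiers_due(policy, raised_at_s=0, now_s=600, acknowledged=True)
```

Keeping the policy as data rather than code means the business, not the technologist, decides who hears about what and when.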
