An even-handed story on high frequency trading?
After months of reading negative spin pieces on high frequency trading (HFT) in the press, I was pleased to see Reuters publish a fairly even-handed piece on the subject that captures not only the usual criticisms of HFT but also a few of its positive aspects. For example:
“A misunderstood dynamic of high-frequency trading is that it thrives off volatility, thereby reducing it. The clear winners in the revolution are small investors, who have seen their trading costs fall remarkably and markets price shares far more efficiently.”
The article also highlights the degree to which these firms keep tabs on and try to one-up each other:
“Lotus Capital Management LP of New York earlier this year realized that a competitor was beating it to a trade it had programmed by exactly 3 microseconds, day after day. The loss meant Lotus was forfeiting about $1,000 in daily revenue on that particular trading strategy. Lotus, a quantitative trading firm that uses high-frequency strategies, invested and tinkered, eventually shaving five microseconds from the router and two microseconds from the execution server.”
I don’t know if that particular story is true or folklore, but it underscores the effort that goes into being in pole position for a given trading strategy.
Trading microseconds for nanoseconds

The co-location of market data systems near or inside exchanges is becoming big business. The ultra-low latency high frequency trading systems that you find in these facilities are niche applications to be sure, but what a niche! NYSE Euronext recently committed to build a 400,000 square foot co-location facility in New Jersey. That’s a big investment to make in something NYSE Euronext CIO Steve Rubinow describes as being for “only the most obsessive traders.”
How obsessive? Architects building these systems measure latency in microseconds, and the best applications exhibit just tens of microseconds of end-to-end latency. Shaving microseconds is like dropping weight before your prize fight weigh-in—whatever it takes, get it down.
To help these latency-obsessed traders develop even faster trading systems, Solace has extended its Unified Messaging API to include a shared-memory transport based on inter-process communication (IPC). This capability lets two applications share information using Solace’s API with less than 700 nanoseconds of average latency in a shared memory environment. Yes, I said nano: billionths of a second. Remember the famous Tabb Report on The Value of a Millisecond? There are a million nanoseconds in a millisecond. 700 nanoseconds is a scant seven-tenths of a microsecond.
To be clear, IPC is a highly specialized technique that only certain systems can leverage because it occurs within the confines of a single server. It applies, for example, when the components of a high-frequency trading system (feed handler, algo, risk assessment, order execution) have been consolidated onto a high-powered multi-core server within a co-location facility. Today these applications run on many machines and share data using low latency messaging (like Solace’s). Shared memory transport among applications running on a single server eliminates the few microseconds associated with network hops, as well as the additional time lags associated with copying memory around between applications. And since IPC is now available as part of the same API customers already use for ultra low latency and other kinds of messaging, applications get the speed they need without giving up the familiar API or the flexibility to redeploy in a networked scenario as needed.
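To make the shared-memory idea concrete, here is a minimal sketch using Python's standard multiprocessing.shared_memory module. It is not Solace's API and says nothing about how Solace implements its transport; the record layout and the publish/consume helpers are hypothetical names invented for illustration. For brevity both handles live in one process, whereas in a real deployment the writer and reader would be separate processes attaching to the same named segment.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed-size record: sequence number, price, size (24 bytes).
RECORD = struct.Struct("<Qdq")


def publish(segment, seq, price, size):
    """Write one record directly into the shared segment (no socket, no network hop)."""
    RECORD.pack_into(segment.buf, 0, seq, price, size)


def consume(name):
    """Attach to the segment by name, as a second application would, and read the record."""
    peer = shared_memory.SharedMemory(name=name)
    try:
        return RECORD.unpack_from(peer.buf, 0)
    finally:
        peer.close()


# The "feed handler" side creates the segment; the "algo" side attaches by name.
segment = shared_memory.SharedMemory(create=True, size=RECORD.size)
try:
    publish(segment, 1, 101.25, 300)
    received = consume(segment.name)
finally:
    segment.close()
    segment.unlink()

print(received)
```

This sketch leaves out everything that makes a production transport hard, notably synchronization between writer and reader (for example a ring buffer with sequence counters), so treat it only as an illustration of why data that never leaves the server avoids network latency.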
As always, we’re not publishing some mysterious single number with no detail on what it means. A white paper describing the environment and parameters of the tests is available for download on our website so customers can dig into the facts and even reproduce the results using their own systems and data. In fact, we did all the testing on a quad-core 3GHz Intel Xeon E5450 server because not everyone has the latest Intel Nehalem.
HFT architects have generally been exempt from corporate technology standards because the stakes are so high they can justify whatever makes them faster. With Solace, HFT no longer needs to be an exception. The same messaging API that is speeding up back office and front office networked trading can be used to speed up co-located HFT as well.
Listen all y’all it’s an arbitrage…
Arbitrage is back, that is, if it ever really went away. Rob Curran recently wrote a piece in the Wall Street Journal’s MarketBeat blog on how the 10-15 millisecond gap between the National Best Bid and Offer (NBBO) and the pricing algorithms in most dark pools of liquidity is making money for technologically advanced traders using latency arbitrage. Basically, if you can calculate the NBBO a handful of milliseconds before the market does, you know where the market will be before it gets there. Easy money, and not violating any current laws.
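As a rough illustration of the mechanics, the sketch below computes a best bid and offer across a few venues and compares a stale midpoint with a fresh one. Everything in it is an assumption made for the example: the Quote type, the venue names and the prices (in cents) are invented, and it ignores order sizes, fees and the actual rules for computing the NBBO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    venue: str  # hypothetical venue label
    bid: int    # price in cents
    ask: int    # price in cents


def best_bid_offer(quotes):
    """Highest bid and lowest ask across all venues (a simplified NBBO)."""
    return max(q.bid for q in quotes), min(q.ask for q in quotes)


def midpoint(bid, ask):
    return (bid + ask) / 2


# What a slow pricing engine still believes a few milliseconds later.
stale_quotes = [Quote("A", 1000, 1003), Quote("B", 1001, 1004), Quote("C", 999, 1002)]
# What a fast trader has already computed from the latest venue updates.
fresh_quotes = [Quote("A", 1002, 1004), Quote("B", 1001, 1004), Quote("C", 1002, 1005)]

stale_bid, stale_ask = best_bid_offer(stale_quotes)
fresh_bid, fresh_ask = best_bid_offer(fresh_quotes)

# Gap between where the slow venue prices and where the market already is.
edge = midpoint(fresh_bid, fresh_ask) - midpoint(stale_bid, stale_ask)
print(f"stale mid {midpoint(stale_bid, stale_ask)}, "
      f"fresh mid {midpoint(fresh_bid, fresh_ask)}, edge {edge} cents")
```

In this made-up case the fast trader sees a midpoint 1.5 cents higher than the one the slow venue is still using, which is the window the article describes.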
In fact, the current economic downturn ensures that this strategy will remain valid for years to come, since many of the sources of liquidity are experiencing budget freezes and will inevitably experience new regulatory distractions as the recovery begins. This locks them into current (comparatively slow) latencies for several years. Meanwhile, many of the smaller, more nimble hedge funds and private equity firms are aggressively investing in high-volume, ultra-low-latency infrastructure that measures decision making, order routing and order execution in tens of microseconds or less.
This is just another validation of the “value of a millisecond” made popular by the Tabb Group in a report last year. As the government slowly shuts down the exclusive night club formerly known as Wall Street, it’s good to know that someone out there is still fighting for their right to party.
Spotlight on Risk Management

There is a good story in Advanced Trading this week about the challenges of applying yesterday’s risk management solutions to today’s market requirements. The whole article is a good read, but you can cut to the chase and just read the summary:
There are three specific data pitfalls that can obscure risk analysis at the portfolio manager and risk officer levels, according to Adam Sussman, director of research at TABB Group:
1. Out-of-Sync: The frequency of the risk data updates lags behind the fast-moving markets. Similarly, the time horizon of the analysis can be misaligned with the investment objective of the portfolio.
2. Opacity: Unfamiliarity with the model behind the analytics puts people at greater risk of making bad decisions.
3. Rigidity: By looking at the same data in the same way, funds are more likely to be negatively impacted by one another. Similarly, approaching risk from a too narrow or rigid viewpoint can obscure vital changes to the risk of a portfolio.
Low Latency Spending Chugs Along
Yesterday Greg MacSweeney at Wall Street & Technology wrote a story about continued robust spending on low latency technology. This reinforces the trend we noted here a few weeks ago: algorithmic activity is spreading rapidly from equities to options and FX. From the WS&T article:
Why is it so important in the options market? “Options is the epicenter of market data, with 2 billion messages a day,” McPartland tells WS&T. “There are a lot of people executing high-speed options strategies.” With the Options Price Reporting Authority (OPRA) recommending that market participants have the capacity to handle nearly 2 million messages a second (10 billion a day) by January 2009, simply handling all of that data is going to require firms to use technology that reduces latency in all parts of the process. (my emphasis)
These kinds of numbers are also behind the increasing interest in hardware solutions for handling the combination of high volume and low-latency. Software is notorious for requiring you to choose one or the other. Hardware middleware gives you both.
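To put the quoted OPRA peak figure in perspective, here is a quick back-of-the-envelope calculation. The function name is my own, and the sketch assumes a single consumer handling messages one at a time, which real systems avoid by processing in parallel.

```python
NANOS_PER_SECOND = 1_000_000_000


def per_message_budget_ns(messages_per_second):
    """Average time available per message if one consumer must keep up with the feed."""
    return NANOS_PER_SECOND / messages_per_second


peak_rate = 2_000_000           # messages per second, from the WS&T quote
daily_volume = 10_000_000_000   # messages per day, from the WS&T quote

budget_ns = per_message_budget_ns(peak_rate)
seconds_at_peak = daily_volume / peak_rate

print(f"{budget_ns:.0f} ns per message at peak")
print(f"{seconds_at_peak:.0f} s of peak-rate traffic would carry the whole day's volume")
```

At 2 million messages a second, that single consumer has 500 nanoseconds per message. The daily total would fit into 5,000 seconds at that rate, so the peak figure describes bursts rather than a sustained load, and it is the bursts the infrastructure has to be sized for.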
Ultra-low latency: when latency focus goes pathological
Last week I discussed general latency issues that affect all kinds of businesses, but the ultra-low latency space within financial services is its own animal. And it’s a real beast, too…mere microseconds can cost companies and their clients countless dollars. So you’ll have to excuse the architects tasked with squeezing them out of the system for being downright pathological about latency.
When people talk about ultra-low latency in financial services, they usually mean front office market data delivery — the elapsed time between a buy or sell occurring and a trading application becoming aware of it. Here the laundry list of issues boils down to extreme focus on just a few:
Blogs abuzz over latency
While financial markets have always cared about latency, it’s not a topic often discussed in other industries. This past week, I stumbled across a couple of excellent blog posts on the subject of latency, primarily as related to web and cloud infrastructures.
First, Todd Hoff posted on website scalability with his expansive “Latency is everywhere and it costs you sales” post. Nati Shalom of Gigaspaces then built upon that post, adding sweeping architectural suggestions from the perspective of a leading data grid provider.
Both are fascinating reads and highlight all the usual sources of latency as well as some that are less obvious. Highly recommended.
There is no one-size-fits-all set of suggestions for dealing with latency, which is why both of these posts are so encyclopedic. The latency hot spots in a web application with 500 milliseconds of latency will be completely different from those in a 50 millisecond database access application, and different still from those in a market data system with less than 100 microseconds of latency. Just as you wouldn’t use a hammer and drill to fix a watch, you need precision tools suited to the job.
I’ll go into more detail on ultra-low latency issues tomorrow.
Hardware acceleration in the spotlight at HPoWS
Last week at the High Performance on Wall Street event, conversation was dominated by the role that specialty hardware can play in accelerating financial trading environments. Technologies like FPGAs, network processors, ASICs and even GPUs were the center of discussion as firms with direct experience shared performance metrics, specialty hardware providers communicated their advantages and software-only solutions took pot shots.
There are now compelling specialty hardware components for nearly all of the end-to-end performance chain, including:
- GPU assist for algo and Monte Carlo simulations (NVIDIA)
- FPGA-based feed handlers (Celoxica, Exegy, Red Line)
- Network processor & FPGA-driven messaging (Solace Systems, TIBCO)
- Network acceleration technologies (Cisco, NetEffect, Arastra)
- Analytics (XtremeData)
At last year’s HPoWS, hardware acceleration was an emerging story. This year it was center stage in the industry’s premier event focusing on ultra-low latency and coping with data volume growth. There is a good summary of the event from HPCWire here.
Further thoughts on FPGA co-processing and performance
I’d like to go one layer down on a point that was introduced in the last post: why you can’t lump all non-software approaches into one hardware bucket. Software suffers in terms of performance in two fundamental ways: heavy CPU loading when tasks are complex, and context switching between kernel and application space when iteration counts are high (i.e., millions of anything per second). Let’s go through how hardware helps with each.
Issue 1: Heavy CPU Loading/CPU Offload
CPU-intensive operations are difficult for general purpose software to execute on general purpose CPUs. Examples might be Monte Carlo simulations, complex algorithms or complex transformation of large data records. Think of it this way: let’s say that the CPU cost of running simulations in software is as follows:

Intel promoting on-board FPGAs to address low-latency financial market
Rik Turner of CBR recently wrote an interesting story about Intel trying to break into the low-latency financial services space. Intel is courting FPGA chip manufacturers and solution providers that leverage FPGAs to partner with it as it launches its Nehalem technology, faster front-side bus and FPGA co-processing capabilities.
Of course, the idea of hosting FPGA co-processing is not new: AMD has been offering a version of this approach for over two years. Intel is clearly playing catch-up here. It’s also not surprising that Intel would be concerned about any key market moving processing to specialized hardware that can outperform software on Intel processors by 10, 20 or even 50 times, especially if one box of non-Intel special-purpose hardware can replace the work of 10 to 30 Intel boxes running software.
This is a classic case of “if you can’t beat ’em, join ’em,” which has been a successful strategy for Intel in the past. The question is, who exactly will they be joining?


