Streaming, low-latency communication in on-line trading systems

This paper presents and evaluates the performance of a prototype on-line OPRA data feed decoder. Our work demonstrates that, by combining best-in-class commodity hardware, algorithmic innovations and careful design, it is possible to match the performance of custom-designed hardware solutions. Our prototype system integrates the latest Intel Nehalem processors and Myricom 10 Gigabit Ethernet technologies with an innovative algorithmic design based on the DotStar compilation tool. The resulting system provides low latency, high bandwidth and the flexibility of commodity components in a single framework, with an end-to-end latency of less than four microseconds and an OPRA feed processing rate of almost 3 million messages per second per core, with a packet payload of only 256 bytes.
