Evolutionary approach to optimization of data representation for classification of patterns in financial ultra-high frequency time series

This paper proposes an evolutionary approach to optimizing the data representation used for classifying patterns in financial ultra-high frequency time series. The input data describe order book shapes, defined by sequences of price-capital pairs derived from the ask and bid orders registered on the stock market. The target data describe changes in the mid price. SVM-based classifiers attempt to predict the direction of mid-price changes from the input data. A key problem in this setting is how to represent the input data: the raw data consist of long, irregular queues of orders and must be transformed into a feature vector. An evolutionary algorithm is proposed to optimize a representation of the input data based on Gaussian curves. Experiments on real-world data from the London Stock Exchange Rebuilt Order Book database confirm that the evolutionary algorithm can significantly improve classification results.
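To make the representation step concrete, the following is a minimal sketch of one way an irregular queue of price-capital pairs could be mapped to a fixed-length feature vector using Gaussian curves. It is an illustration under stated assumptions, not the paper's actual formulation: the function name `gaussian_features`, the parameterization by `centers` and `widths` (offsets from the mid price and standard deviations), and the sample order book are all hypothetical. In a setup like the one described, an evolutionary algorithm would presumably tune parameters of this kind, scoring each candidate by the performance of the classifier trained on the resulting features.

```python
import math

def gaussian_features(orders, mid_price, centers, widths):
    """Map (price, capital) orders to a fixed-length feature vector.

    Assumed scheme: feature k is the capital-weighted sum of a Gaussian
    curve centered at mid_price + centers[k] with std. dev. widths[k].
    """
    features = []
    for center, width in zip(centers, widths):
        total = 0.0
        for price, capital in orders:
            z = (price - mid_price - center) / width
            total += capital * math.exp(-0.5 * z * z)
        features.append(total)
    return features

# Hypothetical order book snapshot: best bid and best ask listed first.
bids = [(99.5, 200.0), (99.0, 150.0), (98.0, 400.0)]
asks = [(100.5, 100.0), (101.0, 300.0)]
mid = (bids[0][0] + asks[0][0]) / 2.0

# Hypothetical curve parameters; these are what an optimizer would search over.
centers = [-1.0, -0.5, 0.5, 1.0]
widths = [0.25, 0.25, 0.25, 0.25]

vector = gaussian_features(bids + asks, mid, centers, widths)
```

The vector length depends only on the number of curves, not on the number of orders in the book, which is what lets queues of differing lengths feed a classifier that expects fixed-size input.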
