A novel text mining approach to financial time series forecasting

Financial time series forecasting is challenging because the series are noisy, non-stationary and chaotic. Most existing forecasting models do not take market sentiment into account. Motivated by the observation that market sentiment carries useful forecasting information, this paper uses textual information to aid financial time series forecasting and presents a novel text mining approach that combines ARIMA and SVR (Support Vector Regression). The approach consists of three steps: representing textual data as feature vectors, using ARIMA to model the linear part of the series, and developing an SVR model based only on the textual feature vectors to model the nonlinear part. To verify the effectiveness of the proposed approach, the quarterly ROEs (Return on Equity) of six security companies are chosen as the forecasting targets. Compared with several existing state-of-the-art models, the proposed approach gives superior results. This indicates that a model that incorporates market sentiment is a promising alternative for financial time series prediction.
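
The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes TF-IDF as the text representation, substitutes a least-squares AR(1) fit for a full ARIMA model to keep the example short, and trains the SVR on the residuals of the linear fit, which is one common way to model the nonlinear part in hybrid schemes. The function names, hyperparameters, series values and report snippets are all hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVR


def fit_ar1(y):
    """Least-squares fit of y[t] = c + phi * y[t-1] (simplified stand-in for ARIMA)."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return c, phi


def hybrid_forecast(y, docs, next_doc):
    """One-step-ahead forecast; docs[t] is the text associated with y[t]."""
    y = np.asarray(y, dtype=float)

    # Step 1: represent the text as feature vectors (TF-IDF is an assumption).
    vectorizer = TfidfVectorizer()
    X_text = vectorizer.fit_transform(docs)

    # Step 2: model the linear part of the series.
    c, phi = fit_ar1(y)
    residuals = y[1:] - (c + phi * y[:-1])

    # Step 3: SVR on the textual features only, fitted to what the
    # linear model leaves unexplained (assumed residual-based design).
    svr = SVR(kernel="rbf", C=1.0, epsilon=0.01)
    svr.fit(X_text[1:], residuals)

    linear_next = c + phi * y[-1]
    text_next = svr.predict(vectorizer.transform([next_doc]))[0]
    return linear_next + text_next, linear_next, text_next


# Invented toy data: eight quarterly ROE values and one report snippet each.
roe = [0.051, 0.048, 0.062, 0.070, 0.055, 0.049, 0.066, 0.073]
reports = [
    "weak trading volume and falling commissions",
    "market downturn hurts brokerage income",
    "strong rally lifts underwriting revenue",
    "bullish sentiment and record trading volume",
    "profit taking and cautious investors",
    "regulatory concerns weigh on sentiment",
    "recovery in trading and rising commissions",
    "optimistic outlook and strong inflows",
]
forecast, linear_part, text_part = hybrid_forecast(
    roe, reports, "strong sentiment and rising trading volume"
)
print(f"linear={linear_part:.4f} text={text_part:.4f} forecast={forecast:.4f}")
```

The final forecast is the sum of the two components, so the contribution of the textual features can be inspected separately from the linear extrapolation.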
