Textual Analysis of Stock Market Prediction Using Financial News Articles

This paper examines the predictive role of financial news articles under three different textual representations: Bag of Words, Noun Phrases, and Named Entities. We measure each representation's ability to predict discrete stock prices twenty minutes after an article's release. Using a Support Vector Machine (SVM) derivative, we show that our model predicts future stock prices with a statistically significant improvement over linear regression. We further demonstrate that a Noun Phrase representation scheme performs better than the de facto standard of Bag of Words.
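
To make the Bag of Words representation concrete, the following is a minimal sketch, not the authors' implementation. The stop-word list, the tokenizer, the function names, and the sample headlines are all assumptions made for illustration. It turns each article into a vector of term counts over a shared vocabulary.

```python
from collections import Counter
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "on", "for", "its"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def bag_of_words(articles):
    """Return (vocabulary, vectors) with one term-count vector per article."""
    counts = [Counter(tokenize(a)) for a in articles]
    # The vocabulary is every term seen in any article, in sorted order.
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


# Invented sample headlines (not data from the paper).
articles = [
    "Acme raises its profit forecast",
    "Acme cuts profit forecast on weak demand",
]
vocabulary, vectors = bag_of_words(articles)
```

In a setup like the one described, vectors of this kind would serve as input features to the regression model. The Noun Phrase and Named Entity representations would replace `tokenize` with a phrase or entity extractor, which is beyond this sketch.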
