The Predicting Power of Textual Information on Financial Markets

Mining textual documents and time series concurrently, such as predicting the movements of stock prices based on the contents of news stories, is an emerging topic in the data mining community. Previous research has shown that there is a strong relationship between the time when news stories are released and the time when stock prices fluctuate. In this paper, we propose a systematic framework for predicting the tertiary movements of stock prices by analyzing the impacts of news stories on the stocks. More specifically, we investigate the immediate impacts of news stories on the stocks based on the Efficient Markets Hypothesis. Several data mining and text mining techniques are used in a novel way. Extensive experiments using real-life data are conducted, and encouraging results are obtained.
