A Combined Approach for Extracting Financial Instrument-Specific Investor Sentiment from Weblogs

Investor sentiment about future returns of financial instruments is a highly relevant information source for investment managers and other stakeholders in the financial industry. Such sentiment is abundant in financial blog texts. Making use of it is a massive information management challenge, given the millions of blog articles whose ever-changing and growing content must be acquired and interpreted. We propose a novel approach for extracting investor sentiment from blogs that combines machine learning at the document level with knowledge-based information extraction at the sentence level. The proposed artifact is a financial instrument-specific investor sentiment extraction method, which we apply to a set of blog articles. The evaluation suggests that the combined approach achieves higher precision than a standalone knowledge-based approach.
