The Performance Evaluation of Machine Learning Classifiers on Financial Microblogging Platforms

As technological advancements facilitate the democratization of knowledge, microblogging platforms are competing with news outlets to become the premier source of information. A huge number of messages is generated across different microblogging platforms. In financial markets, microblogging websites such as StockTwits have become a rich source of information for amateur investors, which makes them ideal sources for market sentiment analysis. Indeed, StockTwits has been widely used by researchers for sentiment analytics and market prediction. However, the quality of sentiment analysis depends heavily on the machine learning classifiers used and on how the data are preprocessed. In this study, we compare the performance of different machine learning classifiers on user-generated content from StockTwits. We find that the Logistic Regression classifier performs best in a 2-way classification of StockTwits data. Our results show higher classification accuracy than a similar study that used data from Twitter. We discuss the managerial implications of our results.
