Value and Misinformation in Collaborative Investing Platforms

In crowdsourced systems, it is often difficult to separate highly capable “experts” from average workers. This is especially true in challenging application domains that require extensive domain knowledge. Stock analysis is one such domain: even highly paid, well-educated domain experts are prone to mistakes. In such a challenging problem space, the “wisdom of the crowds” property that many crowdsourced applications rely on may not hold. In this article, we study the problem of evaluating and identifying experts in the context of SeekingAlpha and StockTwits, two crowdsourced investment services that have recently begun to encroach on a space dominated for decades by large investment banks. We seek to understand the quality and impact of content on collaborative investment platforms by empirically analyzing complete datasets of SeekingAlpha articles (9 years) and StockTwits messages (4 years). We develop sentiment analysis tools and correlate contributed content with the historical performance of the relevant stocks. While SeekingAlpha articles and StockTwits messages show minimal correlation with stock performance in aggregate, a subset of experts contributes more valuable (predictive) content. We show that these authors can be easily identified through user interactions, and that investments based on their analysis significantly outperform broader markets. This shows that even in challenging application domains, there is a secondary or indirect wisdom of the crowds. We also work to identify potential stock manipulation by detecting authors who control multiple identities. Finally, we conduct a user survey that sheds light on users’ views of SeekingAlpha content and stock manipulation.
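As a rough illustration of what correlating contributed content with stock performance can look like, the sketch below computes a Pearson correlation coefficient between per-post sentiment scores and subsequent stock returns. It is a minimal sketch, not the authors' pipeline: the abstract does not specify the correlation measure or the data layout, so the choice of Pearson correlation, the function name `pearson`, the +1/-1 sentiment encoding, and the sample values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical data: +1 = bullish post, -1 = bearish post (assumed encoding),
    # paired with the stock's return over some later window (made-up values).
    sentiment = [1, -1, 1, 1, -1, -1]
    later_return = [0.02, -0.01, -0.03, 0.04, 0.01, -0.02]
    print(f"correlation: {pearson(sentiment, later_return):.3f}")
```

A coefficient near zero over all posts, alongside a clearly positive coefficient for some subset of authors, would be consistent with the pattern the abstract describes. Running the same computation per author is one way to compare individual contributors.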
