GubaLex: Guba-Oriented Sentiment Lexicon for Big Texts in Finance

Although technical analysis is a viable tool, trading in the stock market depends largely on investors' emotions. In China, Guba is a typical platform where individual investors share news and opinions about their favorite stocks. The texts that investors post in Guba are rich in emotion and can reflect their disposition toward a stock. Numerous studies have analyzed investor sentiment in financial markets in order to understand the market, but few focus on Guba. Text mining is the most popular method for analyzing the sentiment implied in web text, and it depends heavily on the lexicon. Existing general-purpose lexicons perform poorly on Guba messages. In this work, we construct a specialized lexicon for Chinese Guba, named GubaLex, that takes into account the characteristics of Guba text: short, emotionally rich, colloquial (informal), and stock-market oriented. GubaLex takes the merger of HowNet and NTUSD as its basic sentiment lexicon and then adds stock terms drawn from the Guba corpus and from stock-market information. Based on GubaLex, we develop a bullish lexicon, GL-Bull, and a bearish lexicon, GL-Bear, which contain bullish and bearish sentiment terms respectively, for further sentiment analysis. We also propose an automatic update module and a sentiment classification algorithm for Guba texts. Experiments show that the proposed lexicon performs better in sentiment analysis than previous lexicons such as HowNet and NTUSD.
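The construction described above (merge two general lexicons, add stock terms, split into bullish and bearish sub-lexicons, then classify a post) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function names, the toy entries, the rule of dropping terms whose polarity conflicts, and the substring-count scoring are all hypothetical, and the tiny dictionaries stand in for the real HowNet, NTUSD, and Guba-derived term lists.

```python
# Illustrative sketch only: toy entries, not the real HowNet, NTUSD, or GubaLex contents.
# Polarity convention (an assumption): +1 = positive/bullish, -1 = negative/bearish.

def merge_lexicons(*lexicons):
    """Union several {term: polarity} dicts, dropping terms whose polarity conflicts."""
    merged = {}
    conflicts = set()
    for lexicon in lexicons:
        for term, polarity in lexicon.items():
            if term in merged and merged[term] != polarity:
                conflicts.add(term)
            merged[term] = polarity
    for term in conflicts:
        del merged[term]
    return merged


def split_by_polarity(lexicon):
    """Return (bullish_terms, bearish_terms) as two sets."""
    bullish = {term for term, polarity in lexicon.items() if polarity > 0}
    bearish = {term for term, polarity in lexicon.items() if polarity < 0}
    return bullish, bearish


def classify(text, bullish, bearish):
    """Label a post by counting substring hits of bullish versus bearish terms."""
    score = sum(text.count(term) for term in bullish)
    score -= sum(text.count(term) for term in bearish)
    if score > 0:
        return "bullish"
    if score < 0:
        return "bearish"
    return "neutral"


# Hypothetical stand-ins for the two general-purpose lexicons.
hownet_like = {"喜欢": 1, "失望": -1}   # like, disappointed
ntusd_like = {"喜欢": 1, "担心": -1}    # like, worried
# Hypothetical stock-market terms of the kind found in Guba posts.
stock_terms = {"涨停": 1, "利好": 1, "跌停": -1, "利空": -1}  # limit-up, good news, limit-down, bad news

guba_lexicon = merge_lexicons(hownet_like, ntusd_like, stock_terms)
bull_terms, bear_terms = split_by_polarity(guba_lexicon)

print(classify("明天肯定涨停", bull_terms, bear_terms))     # post containing one bullish term
print(classify("又跌停了，太失望", bull_terms, bear_terms))  # post containing two bearish terms
```

Substring counting sidesteps Chinese word segmentation for brevity; a practical system would more likely segment the text first so that a short term is not counted inside a longer one.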
