Teaching Machines to Understand Chinese Investment Slang

The Chinese A-shares equity market is characterized by heavy retail investor participation. Furthermore, much of the Chinese retail trading is sentiment driven. In this article, the authors discuss their approach of combining a recently developed natural language processing (NLP) technique with online Chinese-language investment blogs favored by Chinese retail investors to understand their views toward various A-share stocks. In addition to complexities associated with NLP application to the Chinese language, one particular challenge posed by the retail investor corpus the authors use is the frequent occurrence of slang. They demonstrate that their approach is able to overcome the challenges encountered. The authors also show how modifying the standard neural network underpinning their NLP approach makes it more suitable for application to the financial domain.

TOPICS: Equity portfolio management, statistical methods, simulations, big data/machine learning

Key Findings
• To understand Chinese A-share sentiments, it is essential to understand the sentiment of retail investors.
• The Chinese language poses several distinct challenges that make application of traditional NLP techniques not particularly fruitful.
• Using the latest neural network–based NLP approach, we can overcome these challenges and gain insights into the A-shares market.
