Stance Analysis for Debates on Traditional Chinese Medicine at Tianya Forum

The Internet and social media have created a new public space for debates on societal topics. This paper applies text mining methods to the stance analysis of online debates, using as a case study the debates on traditional Chinese medicine (TCM) at Tianya Forum, a well-known Chinese bulletin board system (BBS). After the data are crawled and preprocessed, logistic regression is used to build a domain lexicon. Words in the lexicon are then taken as features for automatically distinguishing stances. Furthermore, the topic model latent Dirichlet allocation (LDA) is used to discover topics shared by the different camps. Further analysis then identifies the TCM technical terms and human names that the debates focus on. The classification results show that using domain-discriminating words as classifier features outperforms using nouns, verbs, adjectives and adverbs as features. The results of topic modeling and the further analysis show how the different camps express their stances.
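
The lexicon-building step can be illustrated with a minimal sketch. This is one possible reading of the approach, not the authors' implementation: it fits a bag-of-words logistic regression by batch gradient descent, keeps the words with the largest positive and negative weights as a domain lexicon, and then classifies stance using only those words. The function names (`train_logreg`, `build_lexicon`, `predict_stance`), the hyperparameters, and the English toy posts are all assumptions made for illustration; the paper works on Chinese forum posts after word segmentation.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization stands in for Chinese word segmentation (assumption).
    return text.lower().split()


def train_logreg(docs, labels, epochs=300, lr=0.5, l2=0.01):
    """Fit bag-of-words logistic regression with batch gradient descent."""
    counts = [Counter(tokenize(d)) for d in docs]
    weights = {w: 0.0 for c in counts for w in c}
    bias = 0.0
    n = len(docs)
    for _ in range(epochs):
        # Start each gradient with the L2 penalty term.
        grad_w = {w: l2 * v for w, v in weights.items()}
        grad_b = 0.0
        for x, y in zip(counts, labels):
            z = bias + sum(weights[w] * c for w, c in x.items())
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            grad_b += err / n
            for w, c in x.items():
                grad_w[w] += err * c / n
        bias -= lr * grad_b
        for w in weights:
            weights[w] -= lr * grad_w[w]
    return weights, bias


def build_lexicon(weights, k=3):
    """Keep the k most positive and k most negative words."""
    ranked = sorted(weights, key=weights.get)
    return set(ranked[:k]) | set(ranked[-k:])


def predict_stance(text, weights, bias, lexicon):
    """Return 1 (supporting) or 0 (opposing) using lexicon words only."""
    score = bias + sum(weights[w] for w in tokenize(text) if w in lexicon)
    return 1 if score > 0 else 0


# Hypothetical toy posts: label 1 = supports TCM, label 0 = opposes TCM.
posts = [
    "tcm herbs are effective and safe",
    "acupuncture is effective for pain",
    "tcm tradition is valuable and safe",
    "tcm is pseudoscience without evidence",
    "acupuncture is placebo without evidence",
    "tcm herbs are toxic pseudoscience",
]
stances = [1, 1, 1, 0, 0, 0]

weights, bias = train_logreg(posts, stances)
lexicon = build_lexicon(weights, k=3)
print(sorted(lexicon))
print(predict_stance("tcm is effective and safe", weights, bias, lexicon))
```

Words that occur equally in both camps, such as "tcm" in the toy data, receive weights near zero and drop out of the lexicon, which is the intuition behind preferring discriminating words over all nouns, verbs, adjectives and adverbs.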
