DWWP: Domain-specific new words detection and word propagation system for sentiment analysis in the tourism domain

Abstract: Online travel has grown dramatically in China over the past three years, producing large volumes of unstructured data, such as tourism reviews, from which useful knowledge is hard to extract. This paper presents DWWP, a system consisting of domain-specific new words detection (DW) and word propagation (WP). DW addresses the neglect of user-invented new words and converted sentiment words by means of Assembled Mutual Information (AMI). Inspired by social networks, WP is a new method that incorporates manually calibrated sentiment scores together with semantic and statistical similarity information, improving the quality of the sentiment lexicon over existing data-driven methods. Experimental results show that DWWP improves accuracy by seventeen percentage points over graph propagation on Dataset I and by four percentage points over label propagation on Dataset II.
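The abstract does not define AMI, so the sketch below is not the paper's method. It illustrates only the underlying idea that mutual-information-based new word detection builds on: adjacent tokens that co-occur far more often than chance are candidates for a single new word. It uses plain pointwise mutual information (PMI) over adjacent token pairs; the function name `pmi_bigrams`, the `min_count` threshold, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_bigrams(sentences, min_count=2):
    """Score adjacent token pairs by plain PMI (not the paper's AMI).

    sentences: list of token lists (e.g. pre-segmented review text).
    min_count: hypothetical frequency cutoff; rarer pairs are skipped
               because PMI is unreliable for low counts.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        # PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) )
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy corpus (made up): single-character tokens from short review snippets.
toy_reviews = [
    ["性", "价", "比", "高"],
    ["性", "价", "比", "低"],
    ["高", "低"],
]
scores = pmi_bigrams(toy_reviews, min_count=2)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print("".join(pair), round(score, 3))
```

High-scoring pairs would then be merged and rescored to grow longer candidates. A real detector would also need filtering and a principled threshold, which this sketch leaves out.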
