Training attractive attribute classifiers based on opinion features extracted from review data

Abstract: Researchers have proposed statistical regression models that analyse online review data to identify the attractive attributes of a product or service. This research has the same aim but uses machine learning models instead of statistical models. The proposed approach first extracts attribute-level sentiments from the review text using natural language processing techniques, then derives from those sentiments features that reflect the non-linear relations between attribute performance and customer satisfaction. These non-linear features are fed to a Support Vector Machine (SVM) to train classifiers that predict attractive attributes. The approach is evaluated on a hotel review dataset crawled from TripAdvisor. The experimental results indicate that the classifiers reach a precision of 79.3% and outperform the existing statistical models by a margin of over 10%.
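The abstract does not specify which non-linear features are derived, so the following is only a minimal sketch of one plausible choice, assuming a penalty-reward style contrast: for each attribute, compare the mean overall rating of reviews that mention it positively, negatively, or not at all. The function name `asymmetry_features`, the toy reviews, and the feature names are all hypothetical and are not taken from the paper.

```python
from statistics import mean

def asymmetry_features(reviews, attribute):
    """Derive reward/penalty features for one attribute (illustrative only).

    reviews: list of (overall_rating, sentiments) pairs, where sentiments maps
    an attribute name to +1 (positive mention) or -1 (negative mention).
    """
    pos = [rating for rating, s in reviews if s.get(attribute) == 1]
    neg = [rating for rating, s in reviews if s.get(attribute) == -1]
    absent = [rating for rating, s in reviews if attribute not in s]
    # Baseline satisfaction: reviews that do not mention the attribute;
    # fall back to the mean over all reviews if every review mentions it.
    baseline = mean(absent) if absent else mean(rating for rating, _ in reviews)
    reward = mean(pos) - baseline if pos else 0.0    # gain when the attribute performs well
    penalty = baseline - mean(neg) if neg else 0.0   # loss when the attribute performs badly
    return {"reward": reward, "penalty": penalty, "asymmetry": reward - penalty}

# Toy data (made up): overall rating plus attribute-level sentiments.
reviews = [
    (5, {"pool": 1, "bed": 1}),
    (5, {"pool": 1}),
    (4, {"bed": 1}),
    (4, {"pool": -1}),
    (4, {}),
    (1, {"bed": -1}),
    (2, {"bed": -1, "pool": -1}),
]

features = {attr: asymmetry_features(reviews, attr) for attr in ("pool", "bed")}
for attr, feats in features.items():
    print(attr, feats)
```

In this sketch, an attribute whose reward clearly exceeds its penalty behaves like an attractive attribute in the Kano sense, while the opposite pattern suggests a must-be attribute. Feature vectors of this kind, paired with labelled attributes, could then be passed to an SVM implementation to train the classifier; that step is omitted here because the abstract gives no details of the training setup.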
