Ensemble Classifier for Praise or Complaint Classification and Visualization from Big Data

Advances in Big Data analytics, the Internet of Things (IoT), and machine learning have created new opportunities for business organizations to analyze, monitor, and mine user-generated content in real time for business intelligence using cognitive IoT. Customers share their opinions online through review sites and social media platforms such as Twitter and Facebook. Sentiment analysis combined with real-time reporting can provide precise, valuable contextual insights that support better decision making. Existing sentiment analysis techniques identify only positive, negative, or neutral sentiment and do not consider the informativeness of reviews. Extreme opinions, such as praise and complaint sentences, are informative subsets of positive and negative sentences and are very difficult to find. This chapter proposes an ensemble classifier that uses linguistic features for praise or complaint classification on large customer review datasets, together with visualization of the results. The praise and complaint sentences are further classified by aspect, and the aspect-level analysis is presented from a business intelligence point of view. The performance of four supervised machine learning classifiers (Random Forest, SVC, KNeighbors, and MLP) with hybrid linguistic features, and of an ensemble of these algorithms, is evaluated on hotel and Amazon product review datasets using accuracy, precision, recall, and F1-score. The proposed approach achieves 99.7% accuracy and a 99.6% F1-score and outperforms existing approaches.
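
As a rough illustration of how an ensemble of four classifiers can be combined and scored with the metrics named above, the following minimal sketch uses hard majority voting on toy labels. The voting scheme, the helper names `majority_vote` and `binary_metrics`, and all label data are assumptions made for illustration; the abstract does not state how the ensemble combines its members, and the numbers below are unrelated to the reported results.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard (majority) voting.

    Assumption: ties are resolved in favour of the label that appears first
    among the votes, i.e. the earliest-listed model wins a tie.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common keeps first-seen order among labels with equal counts
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive="praise"):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical gold labels and per-model predictions for six sentences.
P, C = "praise", "complaint"
y_true = [P, C, P, C, P, C]
model_outputs = [
    [P, C, P, P, P, C],  # stand-in for Random Forest
    [P, C, C, C, P, P],  # stand-in for SVC
    [C, C, P, C, P, P],  # stand-in for KNeighbors
    [P, P, P, C, C, P],  # stand-in for MLP
]

ensemble = majority_vote(model_outputs)
metrics = binary_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

In practice the four members would be trained models producing these predictions from linguistic features; hard voting is only one of several ways to combine them, and weighted or probability-based voting are common alternatives.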
