Corpus Analysis and Annotation for Helpful Sentences in Product Reviews

Over the last two decades, studies on the quality of online product reviews have mostly focused on classifying complete documents as helpful or unhelpful using supervised learning methods. As in any supervised machine-learning task, a manually annotated corpus is required to train a model. Corpora annotated for helpfulness are therefore an important resource for understanding what makes online product reviews helpful and for ranking them by quality. However, most such corpora are annotated at the document level, that is, for the full review, and little attention has been paid to a finer-grained analysis of the helpful content within reviews. This article proposes a new annotation scheme for identifying the helpful sentences in each product review in the dataset. The annotation scheme, the guidelines and the inter-annotator agreement scores are presented and discussed. A high level of inter-annotator agreement is obtained, indicating that the annotated corpus is suitable to support subsequent research.
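As an illustration of how agreement on sentence-level labels can be quantified, the following is a minimal sketch that computes Cohen's kappa for two annotators. The abstract does not state which agreement statistic is used, so the choice of Cohen's kappa, the binary helpful/unhelpful labels, the function name `cohen_kappa` and the toy data are all assumptions made for this example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")

    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # annotators' marginal proportions, summed over all categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (assumed): 1 = helpful sentence, 0 = not helpful.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

On this toy data the two annotators agree on 8 of 10 sentences, while chance agreement from the marginals is 0.52, so kappa corrects the raw 0.8 agreement downward. With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the analogous choice.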
