Investigating Determinants of Voting for the "Helpfulness" of Online Consumer Reviews: A Text Mining Approach

The “helpfulness” feature of online user reviews helps consumers cope with information overload and facilitates decision making. However, many online user reviews receive too few helpfulness votes for other users to gauge how helpful they actually are. This study empirically examines how three groups of review features (basic, stylistic, and semantic) affect the number of helpfulness votes a review receives. Text mining techniques are employed to extract semantic characteristics from review texts. Our findings show that semantic characteristics are more influential than the other characteristics in determining how many helpfulness votes a review receives. Our findings also suggest that reviews with extreme opinions receive more helpfulness votes than those with mixed or neutral opinions. This paper sheds light on online users’ helpfulness voting behavior and informs the design of better helpfulness voting mechanisms for online user review systems.
