Review Selection Using Micro-Reviews

Given the proliferation of review content, and the fact that reviews are highly diverse and often unnecessarily verbose, users frequently face the problem of selecting the appropriate reviews to read. Micro-reviews are emerging as a new type of online review content in social media. They are posted by users of check-in services such as Foursquare, and they are concise (up to 200 characters long) and highly focused, in contrast to comprehensive and verbose full-length reviews. In this paper, we propose a novel mining problem that brings together these two disparate sources of review content. Specifically, we use coverage of micro-reviews as an objective for selecting a set of reviews that efficiently cover the salient aspects of an entity. Our approach consists of a two-step process: matching review sentences to micro-reviews, and selecting a small set of reviews that covers as many micro-reviews as possible using few sentences. We formulate this objective as a combinatorial optimization problem and show how to derive an optimal solution using Integer Linear Programming. We also propose an efficient heuristic algorithm that approximates the optimal solution. Finally, we perform a detailed evaluation of all the steps of our methodology using data collected from Foursquare and Yelp.
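
To make the selection step concrete, the following is a minimal sketch of one possible greedy heuristic. It is not the algorithm from the paper, whose details the abstract does not give. It assumes the matching step has already produced, for each review, the set of micro-reviews its sentences match. The names `greedy_select`, `coverage`, `lengths`, and `k` are hypothetical, and the score (newly covered micro-reviews per sentence) is an illustrative choice.

```python
from typing import Dict, List, Set


def greedy_select(coverage: Dict[str, Set[str]],
                  lengths: Dict[str, int],
                  k: int) -> List[str]:
    """Pick up to k reviews, each round taking the review that adds the most
    newly covered micro-reviews per sentence (an assumed scoring rule)."""
    selected: List[str] = []
    covered: Set[str] = set()
    for _ in range(k):
        best, best_ratio = None, 0.0
        # Iterate in sorted order so ties are broken deterministically.
        for review in sorted(coverage):
            if review in selected:
                continue
            gain = len(coverage[review] - covered)
            ratio = gain / max(lengths[review], 1)
            if ratio > best_ratio:
                best, best_ratio = review, ratio
        if best is None:
            # No remaining review covers anything new.
            break
        selected.append(best)
        covered |= coverage[best]
    return selected


# Toy input (made up for illustration): review id -> matched micro-review ids.
coverage = {"r1": {"t1", "t2", "t3"}, "r2": {"t3", "t4"}, "r3": {"t1"}}
# Review id -> number of sentences.
lengths = {"r1": 6, "r2": 2, "r3": 5}
print(greedy_select(coverage, lengths, k=2))
```

On this toy input the short review `r2` is chosen first because it covers two micro-reviews in two sentences, and `r1` is chosen second because it adds the two remaining micro-reviews. An exact solution would instead encode the same objective as an integer program and pass it to a solver.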
