The Pareto Principle Is Everywhere: Finding Informative Sentences for Opinion Summarization Through Leader Detection

Most previous work on opinion summarization focuses on summarizing the sentiment polarity distribution over different aspects of an entity (e.g., the battery life and screen of a mobile phone). However, users' needs may go beyond this kind of summary. Besides such coarse-grained, aspect-level summarization, a user may prefer to read detailed but concise text drawn from the opinion data. In this paper, we propose a new framework for opinion summarization. Our goal is to help users obtain useful opinions from reviews by reading only a short summary consisting of a few informative sentences, where the quality of the summary is evaluated in terms of both aspect coverage and viewpoint preservation. More specifically, we formulate informative sentence selection in opinion summarization as a community leader detection problem, where a community is a cluster of sentences about the same aspect of an entity, and the leaders are the most informative sentences for that aspect. We develop two effective algorithms to identify communities and leaders. Reviews of six products from Amazon.com are used to verify the effectiveness of our method for opinion summarization.
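
To make the formulation concrete, the following is a minimal sketch of the general idea, not the two algorithms developed in the paper. It assumes that sentences are linked by Jaccard word overlap, that connected components stand in for communities, and that node degree stands in for leadership. The function names, the whitespace tokenization, and the threshold of 0.4 are all illustrative assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(sentences, threshold=0.4):
    """Link two sentences when their word overlap reaches the threshold.

    Assumption: plain whitespace tokenization and a hand-picked threshold.
    """
    words = [set(s.lower().split()) for s in sentences]
    graph = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if jaccard(words[i], words[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def find_communities(graph):
    """Connected components, used here as a simple stand-in for communities."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        result.append(sorted(component))
    return result


def find_leaders(graph, communities, per_community=1):
    """Pick the highest-degree sentences of each community as its leaders.

    Ties are broken by the lower sentence index.
    """
    picked = []
    for community in communities:
        ranked = sorted(community, key=lambda n: (-len(graph[n]), n))
        picked.extend(ranked[:per_community])
    return picked


def summarize(sentences, threshold=0.4, per_community=1):
    """Return the leader sentences, one group per detected community."""
    graph = build_graph(sentences, threshold)
    leaders = find_leaders(graph, find_communities(graph), per_community)
    return [sentences[i] for i in leaders]
```

Under these assumptions, each community contributes its own leaders to the summary, which is how aspect coverage is encouraged: a summary built this way draws at least one sentence from every detected sentence cluster.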
