Summarizing Contrastive Themes via Hierarchical Non-Parametric Processes

Given a topic of interest, a contrastive theme is a group of opposing pairs of viewpoints. We address the task of summarizing contrastive themes: given a set of opinionated documents, select meaningful sentences that represent the contrastive themes present in those documents. Several factors make this problem challenging: the number of topics is unknown, the relationships among topics are unknown, and comparative sentences must be extracted. Our approach has three core ingredients: contrastive theme modeling, diverse theme extraction, and contrastive theme summarization. Specifically, we present a hierarchical non-parametric model that describes hierarchical relations among topics; this model is used to infer threads of topics, which serve as themes, from the nested Chinese restaurant process. We enhance the diversity of themes by using structured determinantal point processes to select a set of diverse, high-quality themes. Finally, we pair contrastive themes and employ an iterative optimization algorithm to select sentences, explicitly considering contrast, relevance, and diversity. Experiments on three datasets demonstrate the effectiveness of our method.
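
To make the diverse theme extraction step more concrete, the following is a minimal sketch of how a determinantal point process trades off quality against diversity. It is not the method described above: it uses a plain (unstructured) DPP with greedy selection rather than structured determinantal point processes, and the names `build_kernel` and `greedy_dpp`, the quality scores, and the two-dimensional feature vectors are all hypothetical, chosen only for illustration.

```python
import math


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def build_kernel(quality, features):
    """L-ensemble kernel: L[i][j] = q_i * sim(i, j) * q_j (assumed form)."""
    n = len(quality)
    return [
        [quality[i] * cosine(features[i], features[j]) * quality[j]
         for j in range(n)]
        for i in range(n)
    ]


def greedy_dpp(kernel, k):
    """Greedily add the item that maximizes the determinant of the
    selected sub-kernel; a common heuristic, not exact MAP inference."""
    selected = []
    for _ in range(k):
        best, best_det = None, 0.0
        for i in range(len(kernel)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[r][c] for c in idx] for r in idx]
            d = det(sub)
            if d > best_det:
                best, best_det = i, d
        if best is None:  # no remaining item adds positive volume
            break
        selected.append(best)
    return selected


# Hypothetical themes: 0 and 1 are near-duplicates, 2 is distinct but weaker.
quality = [1.0, 0.9, 0.6]
features = [[1.0, 0.0], [0.99, 0.14], [0.0, 1.0]]
chosen = greedy_dpp(build_kernel(quality, features), 2)
print(chosen)
```

In this toy input, theme 1 has higher quality than theme 2 but is nearly identical to theme 0, so the determinant of the pair {0, 1} is close to zero and the sketch selects themes 0 and 2 instead.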
