Time-aware Multi-Viewpoint Summarization of Multilingual Social Text Streams

A viewpoint is a triple consisting of an entity, a topic related to this entity, and a sentiment towards this topic. In time-aware multi-viewpoint summarization, one monitors viewpoints for a running topic and selects a small set of informative documents. In this paper, we focus on time-aware multi-viewpoint summarization of multilingual social text streams. Viewpoint drift, ambiguous entities, and multilingual text make this a challenging task. Our approach has three core ingredients: dynamic viewpoint modeling, cross-language viewpoint alignment, and multi-viewpoint summarization. Specifically, we propose a dynamic latent factor model that explicitly characterizes a set of viewpoints, through which entities, topics, and sentiment labels during a time interval are derived jointly; we connect viewpoints in different languages using an entity-based semantic similarity measure; and we employ an update viewpoint summarization strategy to generate a time-aware summary that reflects viewpoints. Experiments conducted on a real-world dataset demonstrate the effectiveness of our proposed method for time-aware multi-viewpoint summarization of multilingual social text streams.
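To make the viewpoint triple and the alignment step concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that each viewpoint carries a set of language-independent entity identifiers (the `linked_entities` field, hypothetical), and it substitutes plain Jaccard overlap for the entity-based semantic similarity measure, whose exact form is not given in the abstract. The names `Viewpoint`, `jaccard`, `align_viewpoints`, and the `threshold` parameter are all illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Viewpoint:
    """A viewpoint triple (entity, topic, sentiment) plus illustrative metadata."""
    entity: str
    topic: str
    sentiment: str
    language: str
    # Assumption: entities are linked to language-independent identifiers.
    linked_entities: FrozenSet[str]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two identifier sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def align_viewpoints(
    source: List[Viewpoint],
    target: List[Viewpoint],
    threshold: float = 0.3,
) -> List[Tuple[Viewpoint, Viewpoint, float]]:
    """Pair each source viewpoint with its most similar target viewpoint.

    Jaccard overlap stands in for the paper's similarity measure (assumption).
    Pairs scoring below `threshold` are dropped.
    """
    pairs = []
    for s in source:
        best, best_score = None, 0.0
        for t in target:
            score = jaccard(s.linked_entities, t.linked_entities)
            if score > best_score:
                best, best_score = t, score
        if best is not None and best_score >= threshold:
            pairs.append((s, best, best_score))
    return pairs


# Hypothetical example data: identifiers such as "E1" are placeholders.
english = [Viewpoint("entity A", "topic X", "negative", "en",
                     frozenset({"E1", "E2", "E3"}))]
chinese = [
    Viewpoint("entity A'", "topic X'", "negative", "zh",
              frozenset({"E2", "E3", "E4"})),
    Viewpoint("entity B", "topic Y", "positive", "zh", frozenset({"E9"})),
]
aligned = align_viewpoints(english, chinese)
```

Under these assumptions, the single English viewpoint is paired with the first Chinese viewpoint because they share two of four distinct identifiers; the second Chinese viewpoint shares none and is ignored.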
