Topic and sentiment aware microblog summarization for Twitter

Recent advances in microblog content summarization have primarily viewed the task through the lens of traditional multi-document summarization, where a single microblog post or a collection of posts forms one document. While these techniques already facilitate information aggregation, categorization and visualization of microblog posts, they fall short in two respects: (i) When summarizing a topic from microblog content, not all existing techniques take topic polarity (sentiment) into account. This matters because a summary of a topic should cover all aspects of that topic, and taking polarity into account can lead to the inclusion of the less popular polarity in the summary. (ii) Some summarization techniques produce summaries only at the topic level. However, a given topic can have more than one important aspect, each of which needs to be represented in the summary. Our work in this paper addresses these two challenges by considering topic sentiments and topic aspects in tandem. We compare our work with state-of-the-art Twitter summarization techniques and show that our method outperforms existing methods on standard metrics such as ROUGE-1.
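
For readers unfamiliar with the metric named above, ROUGE-1 measures unigram overlap between a candidate summary and a reference summary. The following is a minimal sketch, assuming lowercased whitespace tokenization and a single reference; the function name `rouge_1` and the example strings are illustrative and are not taken from the paper, whose evaluation setup may differ (for example in stemming or stop-word handling).

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall and F1 between two texts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the game was great", "the game was very great fun")
    print(scores)
```

In this example all four candidate unigrams appear in the six-word reference, so precision is 4/4 and recall is 4/6.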
