Automatic Summarization of Online Debates

Debate summarization is a novel and challenging area of automatic text summarization that remains largely unexplored. In this paper, we develop a debate summarization pipeline that summarizes the key topics discussed or argued by the two opposing sides of online debates. We take the view that debate summaries can be generated through clustering, cluster labeling, and visualization. We investigate two clustering approaches for generating the summaries. The first applies purely term-based clustering and cluster labeling. The second uses X-means for clustering and Mutual Information for labeling the clusters. Both approaches are driven by ontologies. We visualize the results using bar charts. We believe our results offer users an easy entry point for getting a first impression of what is discussed within a debate topic containing a vast number of arguments.
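
To make the labeling step of the second approach more concrete, the following is a minimal sketch of Mutual Information based cluster labeling. It is not the authors' implementation: it assumes the standard term/class formulation of Mutual Information over document counts, it takes cluster assignments as given (from X-means or any other clustering step), and it leaves out the ontology component entirely. The function names `mutual_information` and `label_clusters`, the `top_k` parameter, and the toy documents are all hypothetical. The filter that keeps only terms occurring more often inside a cluster than outside it is likewise an assumption, added because Mutual Information alone also scores terms that are characteristic by their absence.

```python
import math
from collections import Counter


def mutual_information(n11, n10, n01, n00):
    """Mutual Information (in bits) between term presence and cluster membership.

    n11: documents in the cluster that contain the term
    n10: documents outside the cluster that contain the term
    n01: documents in the cluster that lack the term
    n00: documents outside the cluster that lack the term
    """
    n = n11 + n10 + n01 + n00
    # Each cell: (joint count, term marginal, cluster marginal).
    cells = [
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ]
    total = 0.0
    for joint, term_marginal, cluster_marginal in cells:
        if joint > 0:  # empty cells contribute nothing
            total += (joint / n) * math.log2(n * joint / (term_marginal * cluster_marginal))
    return total


def label_clusters(docs, assignments, top_k=3):
    """Return {cluster_id: [top_k terms ranked by Mutual Information]}.

    docs: list of token lists; assignments: one cluster id per document.
    """
    doc_sets = [set(tokens) for tokens in docs]
    doc_freq = Counter(term for s in doc_sets for term in s)
    labels = {}
    for cluster in sorted(set(assignments)):
        members = [s for s, a in zip(doc_sets, assignments) if a == cluster]
        size = len(members)
        rest = len(docs) - size
        in_cluster_freq = Counter(term for s in members for term in s)
        scored = []
        for term, n11 in in_cluster_freq.items():
            n10 = doc_freq[term] - n11
            n01 = size - n11
            n00 = rest - n10
            # Assumption: keep only terms that are over-represented in the cluster.
            if rest and n11 / size <= n10 / rest:
                continue
            scored.append((mutual_information(n11, n10, n01, n00), term))
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        labels[cluster] = [term for _, term in scored[:top_k]]
    return labels


# Toy data (hypothetical): four tokenized comments in two clusters.
docs = [
    ["tax", "cut", "economy"],
    ["tax", "economy", "jobs"],
    ["climate", "carbon", "economy"],
    ["climate", "carbon", "emissions"],
]
assignments = [0, 0, 1, 1]
labels = label_clusters(docs, assignments, top_k=2)
```

In this toy run, terms that appear in every document of one cluster and in none of the other receive the highest score, so they are chosen as labels ahead of terms such as "economy" that are spread across both clusters.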
