Paper information - Comparing Twitter Summarization Algorithms for Multiple Post Summaries

Due to the sheer volume of text generated by a microblog site like Twitter, it is often difficult to fully understand what is being said about various topics. In an attempt to understand microblogs better, this paper compares algorithms for extractive summarization of microblog posts. We present two algorithms that produce summaries by selecting several posts from a given set. We evaluate the generated summaries by comparing them to both manually produced summaries and summaries produced by several leading traditional summarization systems. To shed light on the special nature of Twitter posts, we include extensive analysis of our results, some of which are unexpected.

Jugal K. Kalita | David I. Inouye
