Multiple Alternative Sentence Compressions for Automatic Text Summarization

We perform multi-document summarization by generating compressed versions of source sentences as summary candidates and using weighted features of these candidates to construct summaries. We combine a parse-and-trim approach with a novel technique for producing multiple alternative compressions of each source sentence. In addition, we use a novel method for tuning the feature weights that maximizes the change in the ROUGE-2 score (ΔROUGE) between the existing summary state and the new state that results from adding the candidate under consideration. We also describe experiments using a new paraphrase-based feature for redundancy checking. Finally, we present the results of our DUC2007 submissions and some ideas for future work.
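
As a rough illustration of the quantity being maximized, the sketch below computes the change in a simplified ROUGE-2 recall when one candidate is appended to a summary. It is an assumption-laden approximation, not the system described here: it uses whitespace tokenization and a single reference, it does not call the official ROUGE toolkit, and the function names (`bigram_counts`, `rouge2_recall`, `delta_rouge`) are hypothetical. The weight-tuning procedure itself is not shown.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count bigrams per sentence so no bigram spans a sentence boundary."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # assumption: whitespace tokenization
        counts.update(zip(tokens, tokens[1:]))
    return counts


def rouge2_recall(summary_sentences, reference_sentences):
    """Simplified ROUGE-2 recall: clipped bigram overlap / reference bigrams."""
    ref = bigram_counts(reference_sentences)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    summ = bigram_counts(summary_sentences)
    overlap = sum(min(count, summ[bg]) for bg, count in ref.items())
    return overlap / total


def delta_rouge(summary_sentences, candidate, reference_sentences):
    """Change in simplified ROUGE-2 recall from adding one candidate."""
    before = rouge2_recall(summary_sentences, reference_sentences)
    after = rouge2_recall(summary_sentences + [candidate], reference_sentences)
    return after - before


if __name__ == "__main__":
    reference = ["the cat sat on the mat"]
    summary = ["the cat sat"]
    print(delta_rouge(summary, "on the mat", reference))   # new bigrams covered
    print(delta_rouge(summary, "the cat sat", reference))  # redundant candidate
```

Under this simplification, a candidate that only repeats bigrams already in the summary yields no gain, which is the sense in which a ΔROUGE target penalizes redundant candidates.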
