Explorations in Automatic Book Summarization

Most text summarization research to date has focused on short documents (e.g., news stories, technical reports), and very little work, if any, has been done on the summarization of very long documents. In this paper, we try to address this gap by exploring the problem of book summarization. We introduce a new data set specifically designed for evaluating book summarization systems, and describe summarization techniques that explicitly account for the length of the documents.
