Learning Predicate Insertion Rules for Document Abstracting

Inserting linguistic material into document sentences to create new sentences is a common operation in document abstracting. We investigate a transformation-based learning method that simulates this operation for text summarization. Our work is framed within a theory of transformation-based abstracting, in which an initial text summary is transformed into an abstract by applying rules learnt from a corpus of examples. Our results are as good as those reported in recent work on classification-based predicate insertion.
