Paper Information - Using Word Sequences for Text Summarization

Using Word Sequences for Text Summarization

Traditional approaches to extractive summarization score or classify sentences based on features such as position in the text, word frequency, and cue phrases. These features tend to produce satisfactory summaries, but have the drawback of being domain dependent. In this paper, we propose to tackle this problem by representing the sentences as word sequences (n-grams), a representation widely used in text categorization. The experiments demonstrated that this simple representation not only reduces the domain and language dependency but also improves summarization performance.
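To make the idea of representing a sentence by word sequences concrete, the following is a minimal Python sketch. It is not the authors' implementation: the function names `word_ngrams` and `sentence_features`, the whitespace tokenization, the lowercasing, and the default of unigrams plus bigrams are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(sentence, max_n=2):
    """Return all word n-grams of length 1..max_n, in order of n, then position.

    Assumption: tokens are obtained by lowercasing and splitting on whitespace;
    the paper's actual preprocessing is not described in the abstract.
    """
    tokens = sentence.lower().split()
    grams = []
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the sentence.
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def sentence_features(sentence, max_n=2):
    """Count each n-gram so the sentence becomes a sparse feature vector."""
    return Counter(word_ngrams(sentence, max_n))


if __name__ == "__main__":
    print(word_ngrams("The cat sat", max_n=2))
    print(sentence_features("a b a"))
```

A feature dictionary of this kind could then be passed to any standard classifier that decides whether a sentence belongs in the summary. Because the features are derived only from the surface word sequences, nothing in the representation is tied to a particular domain or language.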

Manuel Montes-y-Gómez | Luis Villaseñor Pineda | Esaú Villatoro-Tello

[1] Alex Alves Freitas, et al. Automatic Text Summarization Using a Machine Learning Approach, 2002, SBIA.

[2] Michele Banko, et al. Using N-Grams To Understand the Nature of Summaries, 2004, HLT-NAACL.

[3] Jihoon Yang, et al. Text Summarization by Sentence Segment Extraction Using Machine Learning Algorithms, 2000, PAKDD.

[4] Johannes Fürnkranz, et al. A Study Using n-gram Features for Text Categorization, 1998.

[5] Francine Chen, et al. A trainable document summarizer, 1995, SIGIR '95.

[6] Eduard H. Hovy, et al. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics, 2003, NAACL.

[7] Constantin Orasan, et al. Building better corpora for summarisation, 2003.

[8] Geoffrey Sampson, et al. The Oxford Handbook of Computational Linguistics, 2003, Lit. Linguistic Comput.

[9] W. B. Cavnar, et al. N-gram-based text categorization, 1994.

[10] Fabrizio Sebastiani, et al. Machine learning in automated text categorization, 2001, CSUR.

[11] Dragomir R. Radev, et al. Centroid-based summarization of multiple documents, 2004, Inf. Process. Manag.

[12] R. Bekkerman, et al. Using Bigrams in Text Categorization, 2003.