Artex is AnotheR TEXt summarizer

This paper describes Artex, another algorithm for automatic text summarization. To rank sentences, a simple inner product is calculated between each sentence, a document vector (the text topic) and a lexical vector (the vocabulary used by a sentence). Summaries are then generated by assembling the highest-ranked sentences. No rule-based linguistic post-processing is necessary to obtain summaries. Tests over several datasets in French, English and Spanish (coming from the Document Understanding Conferences (DUC), Text Analysis Conferences (TAC), evaluation campaigns, etc.) have shown that the summarizer achieves interesting results.
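The ranking step can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the paper's reference implementation. It assumes raw term counts as sentence vectors, takes the document vector to be the average of all sentence vectors, takes the lexical weight of a sentence to be its average term count over the vocabulary, and scores each sentence as the inner product with the document vector multiplied by that weight. The function names (`split_sentences`, `tokenize`, `rank_sentences`, `summarize`), the naive sentence splitter and the tokenizer are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphabetic tokens only (assumption: no stemming or stop list).
    return re.findall(r"[a-z]+", sentence.lower())


def rank_sentences(text):
    """Return (sentences, scores), one score per sentence."""
    sentences = split_sentences(text)
    counts = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted(set().union(*counts))
    n_sent, n_words = len(sentences), len(vocab)
    if n_sent == 0 or n_words == 0:
        return sentences, [0.0] * n_sent

    # Sentence-by-term count matrix.
    matrix = [[c[w] for w in vocab] for c in counts]

    # Document vector (assumed): average count of each term over all sentences.
    topic = [sum(row[j] for row in matrix) / n_sent for j in range(n_words)]

    scores = []
    for row in matrix:
        # Lexical weight (assumed): average count over the vocabulary.
        lexical_weight = sum(row) / n_words
        inner = sum(x * t for x, t in zip(row, topic))
        scores.append(inner * lexical_weight)
    return sentences, scores


def summarize(text, k):
    """Keep the k highest-scoring sentences, in document order."""
    sentences, scores = rank_sentences(text)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats chase mice in the barn. Dogs bark."
    print(summarize(sample, 2))
```

In this sketch the selected sentences are returned in their original order, with no further post-processing, which matches the abstract's statement that summaries are assembled directly from the highest-ranked sentences.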
