Entropy-Based Sentence Compression Controlled by a Perceptron

Sentence compression is a necessary component of abstract generation. Previous studies focused mainly on the syntactic tree representation of the sentence. Our approach is statistical and does not rely on syntactic trees, which can be inaccurate. At the core of our system is a language model based on lemma bigrams and part-of-speech tags (only shallow parsing is performed), together with an entropy computation over sentences that is used to retrieve the best compressed sentences. We also introduce a perceptron that classifies sentences as compressed or non-compressed and indicates whether or not a sentence should be compressed.
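
As a rough illustration of the entropy-based selection described above, the following Python sketch scores candidate compressions with a lemma-bigram model and keeps the one with the lowest per-transition entropy. It is a minimal sketch under stated assumptions, not the system's actual implementation: the function names, the add-one smoothing, the `<s>`/`</s>` boundary markers, and the "lowest entropy wins" selection rule are all hypothetical choices made for the example, and part-of-speech tags and the perceptron are left out.

```python
import math
from collections import Counter

def train_bigram_counts(sentences):
    """Count bigram contexts and bigrams over sentences given as lists of lemmas."""
    contexts, bigrams = Counter(), Counter()
    for lemmas in sentences:
        padded = ["<s>"] + lemmas + ["</s>"]  # assumed boundary markers
        contexts.update(padded[:-1])           # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))
    return contexts, bigrams

def per_token_entropy(lemmas, contexts, bigrams, vocab_size):
    """Average negative log2 probability per bigram transition (add-one smoothing, assumed)."""
    padded = ["<s>"] + lemmas + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log2(p)
    return -log_prob / (len(padded) - 1)

def best_candidate(candidates, contexts, bigrams, vocab_size):
    """Return the candidate compression with the lowest per-transition entropy."""
    return min(
        candidates,
        key=lambda c: per_token_entropy(c, contexts, bigrams, vocab_size),
    )
```

Here `vocab_size` is meant to be the number of distinct symbols that can follow a context (the lemmas plus `</s>`). Dividing by the number of transitions keeps short candidates from being favored merely for their length.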
