A Comparison of Model Free versus Model Intensive Approaches to Sentence Compression

This work introduces a model-free approach to sentence compression, which grew out of ideas from Nomoto (2008), and compares it with a state-of-the-art model-intensive approach known as the Tree-to-Tree Transducer, or T3 (Cohn and Lapata, 2008). We find that the model-free approach significantly outperforms T3 on the particular data we created from the Internet. We also discuss what might have caused T3's poor performance.

[1]  Daniel Marcu, et al.  Summarization beyond sentence extraction: A probabilistic approach to sentence compression, 2002, Artif. Intell.

[2]  Mirella Lapata, et al.  Sentence Compression Beyond Word Deletion, 2008, COLING.

[3]  Thomas Hofmann, et al.  Support vector machine learning for interdependent and structured output spaces, 2004, ICML.

[4]  Mirella Lapata, et al.  Models for Sentence Compression: A Comparison across Domains, Training Requirements and Evaluation Measures, 2006, ACL.

[5]  Christopher D. Manning, et al.  Generating Typed Dependency Parses from Phrase Structure Parses, 2006, LREC.

[6]  Tadashi Nomoto.  A Generic Sentence Trimmer with CRFs, 2008, ACL.

[7]  Pradeep Ravikumar, et al.  A Comparison of String Distance Metrics for Name-Matching Tasks, 2003, IIWeb.

[8]  Andrew McCallum, et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[9]  M. S. Waterman, et al.  Identification of common molecular subsequences, 1981, Journal of Molecular Biology.

[10]  Dan Klein, et al.  Accurate Unlexicalized Parsing, 2003, ACL.

[11]  Ryan T. McDonald.  Discriminative Sentence Compression with Soft Syntactic Evidence, 2006, EACL.

[12]  Sadaoki Furui, et al.  Speech Summarization: An Approach through Word Extraction and a Method for Evaluation, 2004, IEICE Trans. Inf. Syst.

[13]  Masaaki Nagata, et al.  A Stochastic Japanese Morphological Analyzer Using a Forward-DP Backward-A* N-Best Search Algorithm, 1994, COLING.

[14]  Michael Strube, et al.  Generating Constituent Order in German Clauses, 2007, ACL.

[15]  Michael Strube, et al.  Sentence Fusion via Dependency Graph Compression, 2008, EMNLP.

[16]  Kazuhiko Ozeki, et al.  Sentence Compression Using Statistical Information About Dependency Path Length, 2006, TSD.

[17]  Mirella Lapata, et al.  Sentence Compression as Tree Transduction, 2009, J. Artif. Intell. Res.

[18]  Eugene Charniak, et al.  Supervised and Unsupervised Learning for Sentence Compression, 2005, ACL.

[19]  Stefan Riezler, et al.  Statistical Sentence Condensation using Ambiguity Packing and Stochastic Disambiguation Methods for Lexical-Functional Grammar, 2003, NAACL.

[20]  Mirella Lapata, et al.  Large Margin Synchronous Generation and its Application to Sentence Compression, 2007, EMNLP.