Discriminative Sentence Compression with Soft Syntactic Evidence

We present a model for sentence compression that uses a discriminative large-margin learning framework coupled with a novel feature set defined on compressed bigrams as well as on deep syntactic representations provided by auxiliary dependency and phrase-structure parsers. These parsers are trained out-of-domain, so their output contains a significant amount of noise. We argue that the discriminative nature of the learning algorithm allows the model to weight features relative to this noise and to optimize compression accuracy directly. This differs from current state-of-the-art models (Knight and Marcu, 2000), which treat noisy parse trees, for both compressed and uncompressed sentences, as gold standard when calculating model parameters.
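To make the learning setup concrete, the sketch below illustrates a margin-based online learner with features defined on compressed bigrams. It is a toy simplification under stated assumptions, not the paper's implementation: the feature templates, the exhaustive candidate enumerator, and the perceptron-style margin update are all illustrative stand-ins (the actual model uses a much richer syntactic feature set and efficient decoding rather than brute-force enumeration).

```python
# Minimal sketch of discriminative large-margin learning for sentence
# compression with compressed-bigram features. All names and the update
# rule are illustrative assumptions, not the paper's implementation.
from collections import defaultdict
from itertools import combinations

def bigram_features(sentence, kept):
    """Features on adjacent word pairs of the compression ("compressed bigrams").
    `kept` is a sorted tuple of indices into `sentence` marking retained words."""
    feats = defaultdict(float)
    padded = ["<s>"] + [sentence[i] for i in kept] + ["</s>"]
    for a, b in zip(padded, padded[1:]):
        feats[f"bigram={a}_{b}"] += 1.0
    return feats

def score(weights, feats):
    """Linear model: dot product of weights and feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def candidates(sentence):
    """Enumerate all non-empty subsequences as candidate compressions.
    Exponential; a real system would decode with dynamic programming."""
    n = len(sentence)
    for k in range(1, n + 1):
        yield from combinations(range(n), k)

def train(data, epochs=5, margin=1.0):
    """Online margin training: update whenever the gold compression fails to
    beat the best wrong candidate by `margin` (a hypothetical simplification
    of a large-margin update)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sentence, gold in data:
            gold_feats = bigram_features(sentence, gold)
            # Find the highest-scoring incorrect candidate.
            best_feats = None
            for kept in candidates(sentence):
                if kept == gold:
                    continue
                feats = bigram_features(sentence, kept)
                if best_feats is None or score(weights, feats) > score(weights, best_feats):
                    best_feats = feats
            # Perceptron-style update: toward gold, away from the rival.
            if score(weights, gold_feats) - score(weights, best_feats) < margin:
                for f, v in gold_feats.items():
                    weights[f] += v
                for f, v in best_feats.items():
                    weights[f] -= v
    return dict(weights)

if __name__ == "__main__":
    # Toy example: learn to compress by dropping the adverb.
    data = [(["the", "very", "big", "dog", "barked"], (0, 2, 3, 4))]
    w = train(data)
    sent, _ = data[0]
    pred = max(candidates(sent), key=lambda k: score(w, bigram_features(sent, k)))
    print("predicted:", [sent[i] for i in pred])
```

Because the objective penalizes only the score gap between the gold compression and its rivals, the learner can down-weight unreliable (e.g., noise-derived) features whenever they hurt compression accuracy, which is the intuition behind treating parser output as soft evidence rather than gold standard.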