Hidden-Variable Models for Discriminative Reranking

We describe a new method for the representation of NLP structures within reranking approaches. We make use of a conditional log-linear model, with hidden variables representing the assignment of lexical items to word clusters or word senses. The model learns to make these assignments automatically, based on a discriminative training criterion. Training and decoding with the model require summing over an exponential number of hidden-variable assignments: the required summations can be computed efficiently and exactly using dynamic programming. As a case study, we apply the model to parse reranking. The model gives an F-measure improvement of 1.25% beyond the base parser, and a 0.25% improvement beyond the Collins (2000) reranker. Although our experiments are focused on parsing, the techniques described generalize naturally to NLP structures other than parse trees.
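The central computational point above, that the sum over exponentially many hidden-variable assignments can be taken exactly by dynamic programming, can be illustrated with a small sketch. The sketch below is a hypothetical illustration rather than the paper's implementation: it assumes one hidden cluster/sense variable per word, assumes the variables are linked in a tree following the parse's head-modifier dependencies, and assumes the feature scores decompose over nodes and edges of that tree. Under those assumptions a standard inside (sum-product) recursion computes the sum; the function and argument names are invented for the example.

    import math

    def inside(tree, root, values, node_score, edge_score):
        # tree: maps a head word index to the list of its modifier indices
        # values: the finite set of hidden values (clusters or senses) a word may take
        # node_score(w, v): weighted feature score for word w taking hidden value v
        # edge_score(h, hv, m, mv): weighted feature score for a head-modifier pair
        child_betas = {c: inside(tree, c, values, node_score, edge_score)
                       for c in tree.get(root, [])}
        beta = {}
        for v in values:
            total = math.exp(node_score(root, v))
            for child, cb in child_betas.items():
                # marginalize the child's hidden value, conditioned on the head's value v
                total *= sum(math.exp(edge_score(root, v, child, u)) * cb[u]
                             for u in values)
            beta[v] = total
        return beta

    def log_sum_over_assignments(tree, root, values, node_score, edge_score):
        # log of the sum, over all joint hidden-value assignments, of the
        # exponentiated feature score; exact despite the exponential number of terms
        beta = inside(tree, root, values, node_score, edge_score)
        return math.log(sum(beta.values()))

    # Toy usage: word 0 heads words 1 and 2; each word has two possible hidden clusters.
    tree = {0: [1, 2]}
    values = [0, 1]
    node_score = lambda w, v: 0.1 * (w + v)                  # stand-in for a learned score
    edge_score = lambda h, hv, m, mv: 0.2 if hv == mv else 0.0
    print(log_sum_over_assignments(tree, 0, values, node_score, edge_score))

For an n-word tree with k possible hidden values per word, the recursion costs O(n k^2), while the naive sum ranges over k^n assignments.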

[1] Scott Miller et al. Name Tagging with Word Clusters and Discriminative Training, 2004, NAACL.

[2] David J. Spiegelhalter et al. Probabilistic Networks and Expert Systems, 1999, Information Science and Statistics.

[3] Hermann Ney et al. Discriminative Training and Maximum Entropy Models for Statistical Machine Translation, 2002, ACL.

[4] Jun'ichi Tsujii et al. Probabilistic CFG with Latent Annotations, 2005, ACL.

[5] Adwait Ratnaparkhi et al. A maximum entropy model for parsing, 1994, ICSLP.

[6] Michael Collins et al. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron, 2002, ACL.

[7] Massimiliano Ciaramita et al. Supersense Tagging of Unknown Nouns in WordNet, 2003, EMNLP.

[8] Robert L. Mercer et al. Class-Based n-gram Models of Natural Language, 1992, CL.

[9] Daniel M. Bikel. A Statistical Model for Parsing and Word-Sense Disambiguation, 2000, EMNLP.

[10] William T. Freeman et al. Understanding belief propagation and its generalizations, 2003.

[11] Michael Collins et al. Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.

[12] Trevor Darrell et al. Conditional Random Fields for Object Recognition, 2004, NIPS.

[13] Beatrice Santorini et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[14] Mark Johnson et al. Estimators for Stochastic “Unification-Based” Grammars, 1999, ACL.

[15] Naftali Tishby et al. Distributional Clustering of English Words, 1993, ACL.

[16] Mark Johnson et al. Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques, 2002, ACL.

[17] Yoshua Bengio et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[18] Eugene Charniak et al. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking, 2005, ACL.

[19] Michael Collins et al. Ranking Algorithms for Named Entity Extraction: Boosting and the Voted Perceptron, 2002, ACL.

[20] Michael Collins et al. Discriminative Reranking for Natural Language Parsing, 2000, CL.

[21] James R. Curran et al. Parsing the WSJ Using CCG and Log-Linear Models, 2004, ACL.

[22] Aravind K. Joshi et al. An SVM-based voting algorithm with application to parse reranking, 2003, CoNLL.

[23] Marilyn A. Walker et al. SPoT: A Trainable Sentence Planner, 2001, NAACL.

[24] Anoop Sarkar et al. Discriminative Reranking for Machine Translation, 2004, NAACL.