Semi-supervised CCG Lexicon Extension

This paper introduces Chart Inference (CI), an algorithm for deriving a CCG category for an unknown word from a partial parse chart. It is shown to be faster and more precise than a baseline brute-force method, and to achieve wider coverage than a rule-based system. In addition, we show the application of CI to a domain adaptation task for question words, which are largely missing in the Penn Treebank. When used in combination with self-training, CI increases the precision of the baseline StatCCG parser over subject-extraction questions by 50%. An error analysis shows that CI contributes to the increase by expanding the number of category types available to the parser, while self-training adjusts the counts.
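The core idea of deriving a category for an unknown word from its already-parsed neighbours can be illustrated with a toy sketch. This is not the paper's CI algorithm: it only inverts forward and backward application for one unknown word next to a single known constituent, and the names `show`, `infer_left`, and `infer_right`, as well as the tuple encoding of categories, are assumptions made for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: an atomic category is a string ("S", "NP");
# a complex category is a (result, slash, argument) tuple.
Category = Union[str, Tuple["Category", str, "Category"]]


def show(cat: Category) -> str:
    """Render a category in the usual slash notation."""
    if isinstance(cat, str):
        return cat
    result, slash, arg = cat

    def wrap(c: Category) -> str:
        return c if isinstance(c, str) else "(" + show(c) + ")"

    return wrap(result) + slash + wrap(arg)


def infer_left(right: Category, parent: Category) -> List[Category]:
    """Candidates for an unknown word standing to the LEFT of `right`,
    given that the two must combine into `parent`."""
    candidates: List[Category] = []
    # Backward application inverted:  ?  X\Y  =>  X   gives  ? = Y
    if not isinstance(right, str) and right[1] == "\\" and right[0] == parent:
        candidates.append(right[2])
    # Forward application inverted:   ?  Y    =>  X   gives  ? = X/Y
    candidates.append((parent, "/", right))
    return candidates


def infer_right(left: Category, parent: Category) -> List[Category]:
    """Candidates for an unknown word standing to the RIGHT of `left`,
    given that the two must combine into `parent`."""
    candidates: List[Category] = []
    # Forward application inverted:   X/Y  ?  =>  X   gives  ? = Y
    if not isinstance(left, str) and left[1] == "/" and left[0] == parent:
        candidates.append(left[2])
    # Backward application inverted:  Y    ?  =>  X   gives  ? = X\Y
    candidates.append((parent, "\\", left))
    return candidates


# Unknown question word followed by a verb phrase S\NP, with goal category S.
verb_phrase: Category = ("S", "\\", "NP")
for candidate in infer_left(verb_phrase, "S"):
    print(show(candidate))
# prints:
# NP
# S/(S\NP)
```

The second candidate has the shape one would expect for a subject-extraction question word. A realistic system would additionally have to handle composition and type-raising, unknown words in the middle of a span, and ranking of the candidates, none of which this sketch attempts.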

Mark Steedman | Emily Thomforde
