Towards A Universal Tool For NLP Resource Acquisition

This paper describes an approach to developing a universal tool for eliciting knowledge about any language L from a non-expert human user. The purpose of this elicitation is the rapid development of NLP systems. The approach is illustrated with the syntax module of the Boas knowledge elicitation system, which supports a quick ramp-up of a standard transfer-based machine translation (MT) system from L into English. The preparation of knowledge for the MT system is carried out in two stages: the acquisition of descriptive knowledge about L, and the use of that descriptive knowledge to derive operational knowledge for the system. Boas guides the acquisition process using data-driven, expectation-driven, and goal-driven methodologies.
