Paper Information - Last Words: Natural Language Processing and Linguistic Fieldwork

March 2009 marked an important milestone: the First International Conference on Language Documentation and Conservation, held at the University of Hawai‘i. The scale of the event was striking, with five parallel tracks running over three days. The organizers coped magnificently with three times the expected participation (over 300). The buzz among the participants was that we were at the start of something big, that we were already part of a significant and growing community dedicated to “supporting small languages together,” the conference subtitle. The event was full of computation and linguistics, yet devoid of computational linguistics. The language documentation community uses technology to process language, but is largely ignorant of the field of natural language processing. I pondered what we have to offer this community: “Send us your 10 million words of Nahuatl-English bitext and we’ll do you a machine translation system!” “Show us your Bambara WordNet and we’ll use it to train a word sense disambiguation tool!” “Write up the word-formation rules of Inuktitut in this arcane format and we’ll give you a morphological analyzer!” Is there not some more immediate contribution we could offer?

Steven Bird
