Last Words: Natural Language Processing and Linguistic Fieldwork

March 2009 marked an important milestone: the First International Conference on Language Documentation and Conservation, held at the University of Hawai‘i.1 The scale of the event was striking, with five parallel tracks running over three days. The organizers coped magnificently with attendance of over 300, three times what was expected. The buzz among the participants was that we were at the start of something big, that we were already part of a significant and growing community dedicated to “supporting small languages together,” as the conference subtitle put it. The event was full of computation and linguistics, yet devoid of computational linguistics. The language documentation community uses technology to process language, but is largely unaware of the field of natural language processing. I pondered what we have to offer this community: “Send us your 10 million words of Nahuatl-English bitext and we’ll do you a machine translation system!” “Show us your Bambara WordNet and we’ll use it to train a word sense disambiguation tool!” “Write up the word-formation rules of Inuktitut in this arcane format and we’ll give you a morphological analyzer!” Is there not some more immediate contribution we could offer?