NLTK: The Natural Language Toolkit

The Natural Language Toolkit is a suite of program modules, data sets and tutorials supporting research and teaching in computational linguistics and natural language processing. NLTK is written in Python and distributed under the GPL open source license. Over the past year the toolkit has been rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language. This paper reports on the simplified toolkit and explains how it is used in teaching NLP.
