Natural language processing

Efforts to endow computers with the capacity to understand natural language began approximately 40 years ago. These efforts were originally called natural language understanding and are now more frequently called natural language processing (NLP). NLP is considered a branch of artificial intelligence (AI), but over the years it has also become an interesting area of study in computational statistics and text data mining. NLP encompasses approaches that use computers to analyze language, determine semantic similarity, and translate between languages. The field usually deals with written language, but it can also be applied to speech. In this article, we cover the definitions and concepts necessary for understanding NLP, methods at the word and sentence level (word sense disambiguation, part-of-speech tagging, and parsing), and the vector space model for NLP at the document level. Copyright © 2010 John Wiley & Sons, Inc. For further resources related to this article, please visit the WIREs website.