Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition

Dave Bowman: Open the pod bay doors, HAL.
HAL: I'm sorry Dave, I'm afraid I can't do that.

The idea of giving computers the ability to process human language is as old as the idea of computers themselves. This book is about the implementation and implications of that exciting idea. We introduce a vibrant interdisciplinary field with many names corresponding to its many facets, names like speech and language processing, human language technology, natural language processing, computational linguistics, and speech recognition and synthesis. The goal of this new field is to get computers to perform useful tasks involving human language, tasks like enabling human-machine communication, improving human-human communication, or simply doing useful processing of text or speech.

One example of a useful such task is a conversational agent. The HAL 9000 computer in Stanley Kubrick's film 2001: A Space Odyssey is one of the most recognizable characters in 20th century cinema. HAL is an artificial agent capable of such advanced language behavior as speaking and understanding English, and at a crucial moment in the plot, even reading lips. It is now clear that HAL's creator, Arthur C. Clarke, was a little optimistic in predicting when an artificial agent such as HAL would be available. But just how far off was he? What would it take to create at least the language-related parts of HAL? We call programs like HAL that converse with humans in natural language conversational agents or dialogue systems. In this text we study the various components that make up modern conversational agents, including language input (automatic speech recognition and natural language understanding) and language output (dialogue and response planning and speech synthesis).

Let's turn to another useful language-related task, that of making available to non-English-speaking readers the vast amount of scientific information on the Web in English.
Or translating for English speakers the hundreds of millions of Web pages written in other languages like Chinese. The goal of machine translation is to automatically translate a document from one language to another. We introduce the algorithms and mathematical tools needed to understand how modern machine translation works. Machine translation is far from a solved problem; we cover the algorithms currently used in the field, as well as important component tasks.

Many other language processing tasks are also related to the Web. Another such task is Web-based question answering. This is a generalization …