Natural Language Processing: Overview

The advent of the World Wide Web has greatly increased demand for software tools and appliances for processing unstructured and semi-structured natural language text. Ancillary developments, such as corporate intranets, enterprise portals, and ubiquitous e-mail, have created many challenges and opportunities in application areas such as information retrieval, electronic commerce, and knowledge management. On the supply side, the development of language technology to address such attendant problems as information overload and rapid globalization has been facilitated by two technical breakthroughs. The first is conceptual, and represents a new emphasis upon empirical approaches to language processing that rely more heavily upon corpus statistics than linguistic theory. The second is computational, and consists of more powerful, networked machines that are capable of processing millions of documents and performing the billions of calculations that the statistical profiling of large corpora requires. This article outlines the new application areas and describes some of the advances that have been made. The emphasis is upon showing how the technical approaches outlined elsewhere in this encyclopedia can be combined to create products and services that have genuine value.

Peter Jackson | Frank Schilder
