An Overview of Empirical Natural Language Processing

In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than requiring the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, information extraction, and machine translation. This article introduces the series of specialized articles on these topics and attempts to describe and explain the growing interest in using learning methods to aid the development of natural language processing systems.
