Word Space

Representations for semantic information about words are necessary for many applications of neural networks in natural language processing. This paper describes an efficient, corpus-based method for inducing distributed semantic representations for a large number of words (50,000) from lexical cooccurrence statistics by means of a large-scale linear regression. The representations are successfully applied to word sense disambiguation using a nearest-neighbor method.
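
The pipeline summarized above (cooccurrence counts, a low-dimensional word representation, nearest-neighbor sense assignment) can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a truncated SVD as a stand-in for the large-scale linear regression, a symmetric cooccurrence window, and cosine similarity for the nearest-neighbor step. The toy corpus, the sense labels, and all function names are hypothetical.

```python
import numpy as np


def cooccurrence_matrix(sentences, window=2):
    """Count how often each word pair occurs within `window` positions."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    return vocab, index, counts


def word_vectors(counts, dim):
    """Reduce count rows to `dim` dimensions (assumption: truncated SVD)."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    vecs = u[:, :dim] * s[:dim]
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.where(norms == 0, 1.0, norms)  # unit-length rows


def context_vector(words, index, vecs):
    """Represent a context as the sum of the vectors of its known words."""
    known = [vecs[index[w]] for w in words if w in index]
    return np.sum(known, axis=0) if known else np.zeros(vecs.shape[1])


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def disambiguate(context, labeled_contexts, index, vecs):
    """Return the sense label of the most similar labeled context."""
    query = context_vector(context, index, vecs)
    best = max(
        labeled_contexts,
        key=lambda item: cosine(query, context_vector(item[1], index, vecs)),
    )
    return best[0]


# Hypothetical toy corpus in which "bank" occurs in two senses.
sentences = [
    ["bank", "money", "loan"],
    ["bank", "river", "water"],
    ["money", "loan", "interest"],
    ["river", "water", "fish"],
]
vocab, index, counts = cooccurrence_matrix(sentences, window=2)
vecs = word_vectors(counts, dim=2)

# Hypothetical labeled contexts of "bank", one per sense.
labeled = [("finance", ["money", "loan"]), ("river", ["river", "water"])]
print(disambiguate(["loan", "interest"], labeled, index, vecs))
```

Summing word vectors to form a context vector is only one simple way to represent a context. At the scale given in the abstract (50,000 words), a dense count matrix and a full SVD would be impractical, so a sparse solver would be needed in place of `np.linalg.svd`.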
