Word Sense Disambiguation Using a Second Language Monolingual Corpus

This paper presents a new approach for resolving lexical ambiguities in one language using statistical data from a monolingual corpus of another language. The approach exploits the differences between the ways different languages map words to senses. The paper concentrates on the problem of target word selection in machine translation, to which the approach is directly applicable. The algorithm identifies syntactic relations between words, using a source language parser, and maps the alternative interpretations of these relations to the target language, using a bilingual lexicon. The preferred senses are then selected according to statistics on lexical relations in the target language. The selection is based on a statistical model and on a constraint propagation algorithm, which simultaneously handles all ambiguities in the sentence. The method was evaluated on three sets of Hebrew and German examples and was found to be very useful for disambiguation. The paper also includes a detailed comparative analysis of statistical sense disambiguation methods.
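The core selection step can be illustrated with a minimal sketch. It is not the paper's statistical model or its constraint propagation algorithm: it simply picks the target-language relation with the highest raw count. The words, lexicon entries, counts, and the function name `select_translation` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data; none of it is taken from the paper.
# A source-language syntactic relation, as a parser might output it.
source_relation = ("verb-obj", "src_sign", "src_treaty")

# Toy bilingual lexicon: source word -> candidate target-language words.
lexicon = {
    "src_sign": ["sign", "seal"],
    "src_treaty": ["treaty", "contract"],
}

# Toy counts of lexical relations observed in a target-language corpus.
target_counts = Counter({
    ("verb-obj", "sign", "treaty"): 25,
    ("verb-obj", "sign", "contract"): 10,
    ("verb-obj", "seal", "treaty"): 1,
})


def select_translation(relation, lexicon, counts):
    """Return the most frequent target-language rendering of a relation.

    Simplification (an assumption of this sketch): raw counts are compared
    directly, with no confidence threshold and no propagation of the
    decision to other relations in the sentence.
    """
    rel, w1, w2 = relation
    # Enumerate every combination of candidate translations.
    candidates = [(rel, t1, t2) for t1, t2 in product(lexicon[w1], lexicon[w2])]
    # Counter returns 0 for relations never seen in the target corpus.
    best = max(candidates, key=lambda c: counts[c])
    return best, counts[best]


best, count = select_translation(source_relation, lexicon, target_counts)
print(best, count)
```

In this toy setting the ambiguous source words are resolved together, because the choice is made over whole target-language relations rather than over individual words.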
