Sequential Model Selection for Word Sense Disambiguation

Statistical models of word-sense disambiguation are often based on a small number of contextual features or on a model that is assumed to characterize the interactions among a set of features. Model selection is presented as an alternative to these approaches: a sequential search of possible models is conducted to find the model that best characterizes the interactions among features. This paper expands existing model selection methodology and presents the first comparative study of model selection search strategies and evaluation criteria applied to the problem of building probabilistic classifiers for word-sense disambiguation.
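As an illustration of what a sequential search guided by an evaluation criterion can look like, the following is a minimal sketch of a greedy forward search. It is not the paper's implementation: the abstract does not name the search strategies or criteria compared, so the use of AIC and BIC here is an assumption, and `forward_search`, `toy_fit`, and the numbers in `TOY_GAIN` are hypothetical placeholders standing in for a real model-fitting step.

```python
import math

def aic(log_lik, n_params):
    # Akaike's Information Criterion: lower is better.
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_samples):
    # Bayesian Information Criterion: penalty grows with sample size.
    return -2.0 * log_lik + n_params * math.log(n_samples)

def forward_search(candidates, fit, criterion):
    """Greedy forward search: start with no interaction terms and
    repeatedly add the single term that lowers the criterion the most,
    stopping when no addition improves it."""
    selected = frozenset()
    best = criterion(*fit(selected))
    while True:
        trials = [(criterion(*fit(selected | {c})), c)
                  for c in candidates if c not in selected]
        if not trials:
            break
        score, choice = min(trials)
        if score >= best:
            break
        selected, best = selected | {choice}, score
    return selected, best

# Hypothetical stand-in for fitting a model: each interaction term adds a
# fixed log-likelihood gain and a fixed number of parameters (made-up values).
TOY_GAIN = {"A-B": (12.0, 2), "B-C": (1.5, 4), "A-C": (0.5, 1)}

def toy_fit(terms):
    log_lik = -100.0 + sum(TOY_GAIN[t][0] for t in terms)
    n_params = 3 + sum(TOY_GAIN[t][1] for t in terms)
    return log_lik, n_params

selected, best = forward_search(TOY_GAIN, toy_fit, aic)
print(sorted(selected), best)
```

With these toy numbers, only the `A-B` term improves the fit enough to offset its parameter penalty, so the search stops after one step. Swapping in BIC only requires a different `criterion`, for example `lambda ll, k: bic(ll, k, n_samples=500)`. A backward search would instead start from the fullest model and remove terms.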
