A New Supervised Learning Algorithm for Word Sense Disambiguation

The Naive Mix is a new supervised learning algorithm based on a sequential method for selecting probabilistic models. The usual objective of model selection is to find a single model that adequately characterizes the data in a training sample. During selection, however, a sequence of models is generated, consisting of the best-fitting model at each level of model complexity. The Naive Mix uses this sequence to define a probabilistic model, which then serves as a probabilistic classifier for word sense disambiguation. The models in the sequence are restricted to the class of decomposable log-linear models, which offers a number of computational advantages. Experiments disambiguating twelve different words show that a Naive Mix formulated with a forward sequential search and Akaike's Information Criterion rivals established supervised learning algorithms such as decision trees (C4.5), rule induction (CN2), and nearest-neighbor classification (PEBLS).
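The selection-then-mixing idea can be sketched in a simplified form. The sketch below is not the paper's method: it substitutes naive-Bayes-style models over growing feature subsets for decomposable log-linear models, and the toy "line" data, feature names, and add-one smoothing are all invented for illustration. It does, however, follow the abstract's recipe: a forward sequential search scored by AIC produces the best model at each complexity level, and classification averages the sense distributions of every model in that sequence.

```python
import math
from collections import Counter

# Hypothetical toy data for the ambiguous word "line": each training
# instance pairs contextual features with a sense tag.
DATA = [
    ({"prev": "phone", "topic": "call"}, "cord"),
    ({"prev": "phone", "topic": "busy"}, "cord"),
    ({"prev": "long",  "topic": "call"}, "cord"),
    ({"prev": "long",  "topic": "wait"}, "queue"),
    ({"prev": "long",  "topic": "wait"}, "queue"),
    ({"prev": "short", "topic": "wait"}, "queue"),
]
SENSES = sorted({s for _, s in DATA})
ALL_FEATURES = sorted({f for x, _ in DATA for f in x})
VALUES = {f: sorted({x[f] for x, _ in DATA}) for f in ALL_FEATURES}

def fit(feats):
    """Fit a conditional-independence model over a feature subset
    (a stand-in for a decomposable model), with add-one smoothing."""
    prior = Counter(s for _, s in DATA)
    cond = {f: {s: Counter() for s in SENSES} for f in feats}
    for x, s in DATA:
        for f in feats:
            cond[f][s][x[f]] += 1
    return prior, cond

def joint_log_lik(feats, model):
    """Log-likelihood of the training data under the fitted model."""
    prior, cond = model
    n = sum(prior.values())
    ll = 0.0
    for x, s in DATA:
        ll += math.log((prior[s] + 1) / (n + len(SENSES)))
        for f in feats:
            ll += math.log((cond[f][s][x[f]] + 1) /
                           (prior[s] + len(VALUES[f])))
    return ll

def aic(feats):
    """AIC = 2k - 2 ln L, where k counts free parameters."""
    k = (len(SENSES) - 1) + sum(len(SENSES) * (len(VALUES[f]) - 1)
                                for f in feats)
    return 2 * k - 2 * joint_log_lik(feats, fit(feats))

def forward_sequence():
    """Forward sequential search: greedily add the feature that yields
    the lowest AIC, recording the best model at every complexity level."""
    chosen, remaining, sequence = [], list(ALL_FEATURES), [tuple()]
    while remaining:
        best = min(remaining, key=lambda f: aic(chosen + [f]))
        chosen.append(best)
        remaining.remove(best)
        sequence.append(tuple(chosen))
    return sequence

def naive_mix_classify(x):
    """The mix: average the sense distributions of all models in the
    sequence, then pick the sense with the highest total."""
    votes = Counter()
    for feats in forward_sequence():
        prior, cond = fit(feats)
        n = sum(prior.values())
        scores = {}
        for s in SENSES:
            p = (prior[s] + 1) / (n + len(SENSES))
            for f in feats:
                p *= (cond[f][s][x[f]] + 1) / (prior[s] + len(VALUES[f]))
            scores[s] = p
        z = sum(scores.values())
        for s in SENSES:
            votes[s] += scores[s] / z
    return votes.most_common(1)[0][0]

print(naive_mix_classify({"prev": "phone", "topic": "call"}))  # cord
```

On this toy data the "phone" context only ever co-occurs with the cord sense, so every non-empty model in the sequence pulls the mixture toward "cord"; averaging over the whole sequence, rather than trusting a single selected model, is what distinguishes the mix from ordinary model selection.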
