Margin perceptron for word sense disambiguation

Word Sense Disambiguation (WSD) is an AI-complete problem in which the correct sense of each word in a document must be selected from a sense inventory. Support Vector Machines (SVMs) have been successfully applied to supervised WSD, whereas the perceptron has seen little use in this setting. In this paper, we propose a supervised method that combines the Margin Perceptron (MP) with Platt's probabilistic output to resolve word sense ambiguity. Experiments were conducted on the Senseval-3 English Lexical Sample Task data set. The performance is comparable with that of systems using SVMs. Our system is in line with the best system participating in Senseval-3, given that we used only the provided training data and applied no classifier combination technique. The advantage of our method is two-fold. First, the good performance shows that MP can be applied to problems with limited training data, particularly in natural language processing. Second, the MP algorithm used in this work is easy to implement, which benefits both the application and the extension of the algorithm.
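To make the combination of a margin perceptron with Platt's probabilistic output concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a binary task with labels in {+1, -1}, a fixed margin of 1, and plain gradient descent for fitting the sigmoid; the function names, the toy data, and all hyperparameters are hypothetical. How the binary classifiers are extended to multiple senses (for example, one classifier per sense) is not specified here and is left out.

```python
import math

def train_margin_perceptron(X, y, margin=1.0, lr=1.0, epochs=100):
    """Binary margin perceptron: update whenever y * (w.x + b) <= margin."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        updated = False
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= margin:
                # Additive perceptron update toward the correct side.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                updated = True
        if not updated:  # every example already clears the margin
            break
    return w, b

def decision(w, b, x):
    """Raw (uncalibrated) score of the linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def fit_platt(scores, y, lr=0.01, steps=2000):
    """Fit P(y=+1 | s) = 1 / (1 + exp(A*s + B)) by gradient descent
    on the cross-entropy loss, using Platt's smoothed targets."""
    n_pos = sum(1 for yi in y if yi > 0)
    n_neg = len(y) - n_pos
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    A, B = -1.0, 0.0  # assumed starting point for this sketch
    n = len(scores)
    for _ in range(steps):
        gA = gB = 0.0
        for s, yi in zip(scores, y):
            t = t_pos if yi > 0 else t_neg
            p = 1.0 / (1.0 + math.exp(A * s + B))
            # Gradient of the loss w.r.t. (A*s + B) is (t - p).
            gA += (t - p) * s
            gB += (t - p)
        A -= lr * gA / n
        B -= lr * gB / n
    return A, B

def prob(w, b, A, B, x):
    """Calibrated probability that x belongs to the positive class."""
    return 1.0 / (1.0 + math.exp(A * decision(w, b, x) + B))

# Toy, linearly separable data (hypothetical feature vectors).
X = [[2, 1], [1, 2], [3, 3], [-1, -2], [-2, -1], [-3, -3]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_margin_perceptron(X, y)
A, B = fit_platt([decision(w, b, xi) for xi in X], y)
print(prob(w, b, A, B, [3, 3]), prob(w, b, A, B, [-3, -3]))
```

The sigmoid is fitted here on the training scores for brevity; in practice a held-out set or cross-validation is usually preferred for calibration, and a more robust optimizer than fixed-step gradient descent would be a reasonable substitute.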
