Subjective Bayes Method for Word Semantic Similarity Measurement

Measuring semantic similarity between words is a classical problem in natural language processing, and its results benefit many applications such as machine translation, word sense disambiguation, ontology mapping, and computational linguistics. This paper combines knowledge-based and statistical methods for measuring word similarity; its novel aspect is the use of the subjective Bayes method. First, we extract evidence from WordNet. Second, we analyze the reasonableness of each candidate piece of evidence using scatter plots. Third, we generate the sufficiency measure from statistics and piecewise linear interpolation. Fourth, we obtain a comprehensive posterior by integrating uncertainty reasoning with a synthesis strategy for conclusion uncertainty. Finally, we quantify word semantic similarity. On the R&G (65) data set, we conducted experiments with 5-fold cross-validation; the correlation between our results and human judgment is 0.912, a 0.4% improvement over the best existing method, which shows that using the subjective Bayes method to measure word semantic similarity is reasonable and effective.
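
The abstract does not give the formulas behind these steps, so the following is only a minimal sketch of the standard subjective Bayes update as formulated for rule-based inference systems (Duda et al., 1976), not the authors' implementation. It assumes a hypothesis H ("the two words are similar"), a sufficiency measure LS and a necessity measure LN per piece of evidence, and the usual piecewise linear interpolation for uncertain evidence; the paper may instead apply interpolation to the sufficiency measure itself. All function names and numeric values are hypothetical.

```python
def odds(p):
    """Convert a probability (strictly between 0 and 1) to odds."""
    return p / (1.0 - p)


def prob(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def posterior_certain(prior_h, ls):
    """P(H|E) when evidence E is known to hold: O(H|E) = LS * O(H)."""
    return prob(ls * odds(prior_h))


def posterior_uncertain(prior_h, prior_e, ls, ln, p_e_obs):
    """P(H|S) when E is only believed with probability p_e_obs = P(E|S).

    Piecewise linear interpolation through three anchor points:
    P(H|not E) at p_e_obs = 0, P(H) at p_e_obs = P(E), P(H|E) at p_e_obs = 1.
    """
    p_h_e = prob(ls * odds(prior_h))      # evidence certainly true
    p_h_not_e = prob(ln * odds(prior_h))  # evidence certainly false
    if p_e_obs <= prior_e:
        return p_h_not_e + (prior_h - p_h_not_e) * p_e_obs / prior_e
    return prior_h + (p_h_e - prior_h) * (p_e_obs - prior_e) / (1.0 - prior_e)


def combine(prior_h, posteriors):
    """Synthesize several single-evidence posteriors into one.

    Assumes conditionally independent evidence:
    O(H|S1..Sn) = O(H) * prod_i( O(H|Si) / O(H) ).
    """
    prior_odds = odds(prior_h)
    total = prior_odds
    for p in posteriors:
        total *= odds(p) / prior_odds
    return prob(total)


if __name__ == "__main__":
    # Hypothetical values: prior 0.5, two evidence sources with LS = 4, LN = 0.25.
    p1 = posterior_uncertain(0.5, 0.5, 4.0, 0.25, 1.0)
    p2 = posterior_uncertain(0.5, 0.5, 4.0, 0.25, 0.75)
    print(p1, p2, combine(0.5, [p1, p2]))
```

In this reading, the combined posterior would serve as the similarity score for a word pair, with each evidence source contributing a likelihood ratio that raises or lowers the odds of similarity.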
