An Information-Theoretic Definition of Similarity

Similarity is an important and widely used concept. Previous definitions of similarity are tied to a particular application or to a particular form of knowledge representation. We present an information-theoretic definition of similarity that is applicable whenever a probabilistic model of the domain is available. We demonstrate how the definition can be used to measure similarity in a number of different domains.
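As a rough illustration of what "applicable whenever a probabilistic model is available" can mean in practice, the sketch below computes a similarity score from probabilities alone. The abstract does not state a formula, so the form used here is an assumption: the ratio of the information in what two items share to the information needed to describe both, written for the special case of two classes in a taxonomy as 2 log P(C0) / (log P(C1) + log P(C2)), where C0 is their most specific common superclass. The function name `info_similarity` and the probabilities in the usage note are hypothetical.

```python
import math


def info_similarity(p_common: float, p_a: float, p_b: float) -> float:
    """Similarity as shared information over total information.

    Assumed form (not given in the abstract):
        2 * log P(common) / (log P(a) + log P(b))

    p_common: probability of the most specific class covering both items.
    p_a, p_b: probabilities of the two items' own classes.
    """
    for p in (p_common, p_a, p_b):
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")

    total_info = math.log(p_a) + math.log(p_b)
    if total_info == 0.0:
        # Both classes are certain, so they carry no information;
        # treat them as identical by convention in this sketch.
        return 1.0
    return 2.0 * math.log(p_common) / total_info
```

With made-up probabilities, `info_similarity(0.5, 0.25, 0.25)` gives 0.5: the shared class carries half the information of each item's own class. The score is 1 when the common class is as specific as both items and 0 when the only common class is the root (probability 1).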
