Corpus-based Learning of Analogies and Semantic Relations

We present an algorithm for learning from unlabeled text, based on the Vector Space Model (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the SAT college entrance exam. A verbal analogy has the form A:B::C:D, meaning “A is to B as C is to D”; for example, mason:stone::carpenter:wood. SAT analogy questions provide a word pair, A:B, and the problem is to select the most analogous word pair, C:D, from a set of five choices. The VSM algorithm correctly answers 47% of a collection of 374 college-level analogy questions (random guessing would yield 20% correct; the average college-bound senior high school student answers about 57% correctly). We motivate this research by applying it to a difficult problem in natural language processing, determining semantic relations in noun-modifier pairs. The problem is to classify a noun-modifier pair, such as “laser printer”, according to the semantic relation between the noun (printer) and the modifier (laser). We use a supervised nearest-neighbour algorithm that assigns a class to a given noun-modifier pair by finding the most analogous noun-modifier pair in the training data. With 30 classes of semantic relations, on a collection of 600 labeled noun-modifier pairs, the learning algorithm attains an F value of 26.5% (random guessing: 3.3%). With 5 classes of semantic relations, the F value is 43.2% (random: 20%). The performance is state-of-the-art for both verbal analogies and noun-modifier relations.
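The two selection steps described above (picking the most analogous choice pair, and nearest-neighbour classification of a noun-modifier pair) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how a word pair is turned into a vector, so the vectors below are made-up toy numbers, and the use of cosine similarity (the usual VSM measure), the function names, and the class labels are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_analogous(stem_vec, choice_vecs):
    """Return the choice pair whose vector is most similar to the stem pair's vector."""
    return max(choice_vecs, key=lambda pair: cosine(stem_vec, choice_vecs[pair]))


def nearest_neighbour_label(query_vec, training):
    """Return the label of the training pair most similar to the query pair.

    `training` is a list of (vector, label) tuples.
    """
    _, label = max(training, key=lambda item: cosine(query_vec, item[0]))
    return label


# Hypothetical toy vectors; a real system would derive them from corpus statistics.
stem = [3.0, 0.0, 1.0]  # stands in for mason:stone
choices = {
    "teacher:chalk": [0.0, 2.0, 1.0],
    "carpenter:wood": [2.5, 0.2, 1.1],
    "soldier:gun": [1.0, 1.0, 0.0],
}
answer = most_analogous(stem, choices)

# Hypothetical labeled noun-modifier pairs (labels are illustrative only).
training = [
    ([1.0, 0.0, 0.0], "instrument"),
    ([0.0, 1.0, 0.2], "material"),
]
predicted = nearest_neighbour_label([0.9, 0.1, 0.0], training)
```

With these toy numbers, `answer` is `"carpenter:wood"` and `predicted` is `"instrument"`. The same similarity function serves both tasks, which mirrors the abstract's framing of noun-modifier classification as finding the most analogous training pair.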
