Interpreting compound nouns with kernel methods

Abstract: This paper presents a classification-based approach to noun–noun compound interpretation within the statistical learning framework of kernel methods. In this framework, the primary modelling task is to define measures of similarity between data items, formalised as kernel functions. We consider the different sources of information that are useful for understanding compounds and proceed to define kernels that compute similarity between compounds in terms of these sources. In particular, these kernels implement intuitive notions of lexical and relational similarity and can be computed using distributional information extracted from text corpora. We report performance on classification experiments with three semantic relation inventories at different levels of granularity, demonstrating in each case that combining lexical and relational information sources is beneficial and gives better performance than either source taken alone. The data used in our experiments are taken from general English text, but our methods are also applicable to other domains and potentially to other languages where noun–noun compounding is frequent and productive.
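
To make the idea of combining lexical and relational similarity concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each constituent word and each (modifier, head) pair is represented by a co-occurrence probability distribution, uses an exponentiated Jensen-Shannon divergence as an illustrative kernel on distributions, and combines the two information sources with an equal-weight sum. The function names, the toy distributions, and the weighting are all hypothetical.

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence (natural log) between two probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dist_kernel(p, q, alpha=1.0):
    """Illustrative kernel on distributions: exp(-alpha * JSD). An assumption, not the paper's choice."""
    return float(np.exp(-alpha * jsd(p, q)))

def lexical_kernel(c1, c2, word_dist):
    """Compare modifiers with modifiers and heads with heads, then sum."""
    (m1, h1), (m2, h2) = c1, c2
    return (dist_kernel(word_dist[m1], word_dist[m2])
            + dist_kernel(word_dist[h1], word_dist[h2]))

def relational_kernel(c1, c2, pair_dist):
    """Compare the contexts in which the two constituent pairs co-occur."""
    return dist_kernel(pair_dist[c1], pair_dist[c2])

def combined_kernel(c1, c2, word_dist, pair_dist, w_lex=0.5, w_rel=0.5):
    """Non-negative weighted sum of kernels (hypothetical equal weights)."""
    return (w_lex * lexical_kernel(c1, c2, word_dist)
            + w_rel * relational_kernel(c1, c2, pair_dist))

# Toy, made-up distributions over three context features.
word_dist = {
    "steel": [0.7, 0.2, 0.1],
    "glass": [0.6, 0.3, 0.1],
    "knife": [0.1, 0.8, 0.1],
    "eye":   [0.1, 0.1, 0.8],
}
pair_dist = {
    ("steel", "knife"): [0.8, 0.1, 0.1],
    ("glass", "eye"):   [0.7, 0.2, 0.1],
}

a, b = ("steel", "knife"), ("glass", "eye")
k_ab = combined_kernel(a, b, word_dist, pair_dist)
k_aa = combined_kernel(a, a, word_dist, pair_dist)
print(f"k(a, b) = {k_ab:.3f}, k(a, a) = {k_aa:.3f}")
```

A matrix of such kernel values over a training set could then be passed to any kernel classifier that accepts a precomputed Gram matrix. The equal weights are a placeholder; in practice they would be tuned or learned.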
