Derivational Smoothing for Syntactic Distributional Semantics

Syntax-based vector spaces are used widely in lexical semantics and are more versatile than word-based spaces (Baroni and Lenci, 2010). However, they are also sparse, with resulting reliability and coverage problems. We address this problem by derivational smoothing, which uses knowledge about derivationally related words (oldish → old) to improve semantic similarity estimates. We develop a set of derivational smoothing methods and evaluate them on two lexical semantics tasks in German. Even for models built from very large corpora, simple derivational smoothing can improve coverage considerably.
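The general idea of backing off to derivationally related words can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy vectors, and the `families` mapping are hypothetical, and the averaging rule is just one simple smoothing scheme assumed here for illustration. When either word lacks a vector, the sketch averages the cosine similarity over all members of the two derivational families that do have vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def smoothed_similarity(w1, w2, vectors, families):
    """Similarity with back-off to derivational families (assumed scheme).

    If both words have vectors, use them directly. Otherwise average the
    similarity over all family members that have vectors. Returns None
    when no usable pair exists.
    """
    if w1 in vectors and w2 in vectors:
        return cosine(vectors[w1], vectors[w2])
    cands1 = [w for w in families.get(w1, [w1]) if w in vectors]
    cands2 = [w for w in families.get(w2, [w2]) if w in vectors]
    sims = [cosine(vectors[a], vectors[b]) for a in cands1 for b in cands2]
    return sum(sims) / len(sims) if sims else None

# Toy syntax-based vectors (hypothetical counts over dependency contexts).
vectors = {
    "old": {"mod:house": 1.0, "mod:man": 2.0},
    "ancient": {"mod:house": 1.0, "mod:man": 2.0},
}
# Hypothetical derivational families: "oldish" is related to "old".
families = {"oldish": ["oldish", "old"]}

sim = smoothed_similarity("oldish", "ancient", vectors, families)
missing = smoothed_similarity("unseen", "ancient", vectors, families)
```

In this toy setup, "oldish" has no vector of its own, so its similarity to "ancient" is computed through "old"; a word with neither a vector nor a family yields `None`, which corresponds to a remaining coverage gap.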

[1]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[2]  Nizar Habash,et al.  A Categorial Variation Database for English , 2003, NAACL.

[3]  Ulrich Heid,et al.  Design and Application of a Gold Standard for Morphological Analysis: SMOR as an Example of Morphological Evaluation , 2010, LREC.

[4]  James Allan,et al.  Stemming in the language modeling framework , 2003, SIGIR '03.

[5]  Rochelle Lieber,et al.  Morphology and Lexical Semantics , 2004 .

[6]  Bernd Bohnet,et al.  Top Accuracy and Fast Dependency Parsing is not a Contradiction , 2010, COLING.

[7]  P. Resnik Selectional constraints: an information-theoretic model and its computational realization , 1996, Cognition.

[8]  John B. Goodenough,et al.  Contextual correlates of synonymy , 1965, CACM.

[9]  Katrin Erk,et al.  Vector Space Models of Word Meaning and Phrase Meaning: A Survey , 2012, Lang. Linguistics Compass.

[10]  Jan Snajder,et al.  DErivBase: Inducing and Evaluating a Derivational Morphology Resource for German , 2013, ACL.

[11]  Alessandro Lenci,et al.  Distributional Memory: A General Framework for Corpus-Based Semantics , 2010, CL.

[12]  Patrick Pantel,et al.  From Frequency to Meaning: Vector Space Models of Semantics , 2010, J. Artif. Intell. Res.

[13]  Sebastian Padó,et al.  A distributional memory for German , 2012, KONVENS.

[14]  A. W. F. Edwards,et al.  Ronald Aylmer Fisher , 1990 .

[15]  Stanley F. Chen,et al.  An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.

[16]  J. Firth,et al.  Papers in linguistics, 1934-1951 , 1957 .

[17]  Dale Schuurmans,et al.  Strictly Lexical Dependency Parsing , 2005, IWPT.

[18]  M. Kendall Statistical Methods for Research Workers , 1937, Nature.

[19]  Patrick Pantel,et al.  Discovering word senses from text , 2002, KDD.

[20]  Randy Goebel,et al.  Discriminative Learning of Selectional Preference from Unlabeled Text , 2008, EMNLP.

[21]  Roberto Navigli,et al.  An analysis of ontology-based query expansion strategies , 2003 .

[22]  Ellen M. Voorhees,et al.  Query expansion using lexical-semantic relations , 1994, SIGIR '94.

[23]  Max Mühlhäuser,et al.  Comparing Wikipedia and German Wordnet by Evaluating Semantic Relatedness on Multiple Datasets , 2007, NAACL.

[24]  Julio Gonzalo,et al.  Indexing with WordNet synsets can improve text retrieval , 1998, WordNet@ACL/COLING.

[25]  M. Kenward,et al.  An Introduction to the Bootstrap , 2007 .

[26]  Arne Fitschen,et al.  Ein computerlinguistisches Lexikon als komplexes System , 2004 .

[27]  Ido Dagan,et al.  Similarity-Based Models of Word Cooccurrence Probabilities , 1998, Machine Learning.

[28]  Graeme Hirst,et al.  Cross-Lingual Distributional Profiles of Concepts for Measuring Semantic Distance , 2007, EMNLP.

[29]  Katrin Erk,et al.  A Flexible, Corpus-Driven Model of Regular and Inverse Selectional Preferences , 2010, CL.