Detecting Compositionality of Verb-Object Combinations using Selectional Preferences

In this paper we explore the use of selectional preferences for detecting noncompositional verb-object combinations. To characterise the arguments in a given grammatical relationship, we experiment with three models of selectional preference: two use WordNet, and one uses the entries from a distributional thesaurus as classes for representation. In previous work on selectional preference acquisition, the classes used for representation are selected according to their coverage of argument tokens rather than their coverage of argument types. In our distributional thesaurus models and in one of the methods using WordNet, we select the classes for representing the preferences by the number of argument types that they cover; only tokens under these classes that are representative of the argument head data are then used to estimate the probability distribution for the selectional preference model. We demonstrate a highly significant correlation between measures which use these ‘type-based’ selectional preferences and compositionality judgements from a data set used in previous research. The type-based models perform better than the models which use tokens for selecting the classes. Furthermore, the models which use the automatically acquired thesaurus entries produce the best results. The correlation for the thesaurus models is stronger than that of any of the individual features used in previous research on the same data set.
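
The contrast between token-based and type-based class selection can be illustrated with a minimal sketch. This is not the models evaluated in the paper: the counts, the two-class inventory, and the function names (`token_coverage`, `type_coverage`, `best_class`) are all hypothetical, chosen only to show how one very frequent argument can dominate token coverage while a class with many distinct arguments wins on type coverage.

```python
from collections import Counter

# Hypothetical toy data: object heads seen with one verb, with token counts.
object_counts = Counter({"water": 50, "beer": 3, "wine": 2, "tea": 1, "juice": 1})

# Hypothetical class inventory (stand-in for WordNet classes or thesaurus entries).
classes = {
    "liquid": {"water"},
    "beverage": {"beer", "wine", "tea", "juice"},
}

def token_coverage(members, counts):
    """Number of argument tokens (occurrences) falling under the class."""
    return sum(counts[noun] for noun in members)

def type_coverage(members, counts):
    """Number of distinct argument types (observed nouns) under the class."""
    return sum(1 for noun in members if counts[noun] > 0)

def best_class(class_inventory, counts, coverage):
    """Return the class with the highest coverage under the given measure."""
    return max(class_inventory, key=lambda c: coverage(class_inventory[c], counts))

if __name__ == "__main__":
    print("token-based choice:", best_class(classes, object_counts, token_coverage))
    print("type-based choice:", best_class(classes, object_counts, type_coverage))
```

In this toy setting the token-based criterion selects `liquid`, because the single noun "water" accounts for most occurrences, whereas the type-based criterion selects `beverage`, which covers four distinct nouns.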
