A Neural Network Approach to Selectional Preference Acquisition

This paper investigates the use of neural networks for the acquisition of selectional preferences. Inspired by recent advances in neural network models for NLP applications, we propose a neural network model that learns to discriminate between felicitous and infelicitous arguments for a particular predicate. The model is entirely unsupervised: preferences are learned from unannotated corpus data. We propose two neural network architectures: one that handles standard two-way selectional preferences and one that is able to deal with multi-way selectional preferences. The model's performance is evaluated on a pseudo-disambiguation task, on which it is shown to achieve state-of-the-art performance.
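As an illustration of the evaluation setting, the following is a minimal sketch of how accuracy on a pseudo-disambiguation task is commonly computed: each attested predicate-argument pair is set against a corrupted alternative, and the model is counted as correct when it scores the attested argument higher. The count-based scorer, the toy data, and all names here (`pseudo_disambiguation_accuracy`, `count_score`, `test_triples`) are assumptions made for illustration. They stand in for the neural model and are not taken from the paper.

```python
from collections import Counter


def pseudo_disambiguation_accuracy(score, test_triples):
    """Fraction of triples where the attested argument outscores the corrupted one.

    test_triples: list of (predicate, attested_arg, corrupted_arg).
    Ties are counted as errors.
    """
    correct = sum(
        1
        for predicate, attested, corrupted in test_triples
        if score(predicate, attested) > score(predicate, corrupted)
    )
    return correct / len(test_triples)


# Hypothetical stand-in scorer: raw co-occurrence counts from a toy "corpus".
# A neural preference model would replace this function.
counts = Counter({("drink", "water"): 5, ("drink", "beer"): 3, ("eat", "bread"): 4})


def count_score(predicate, argument):
    return counts[(predicate, argument)]


test_triples = [
    ("drink", "water", "stone"),  # attested pair seen in the toy counts
    ("eat", "bread", "idea"),     # attested pair seen in the toy counts
    ("read", "book", "water"),    # unseen pair: both score 0, counted as an error
]

accuracy = pseudo_disambiguation_accuracy(count_score, test_triples)
print(f"pseudo-disambiguation accuracy: {accuracy:.3f}")
```

The third triple shows why a purely count-based scorer struggles with unseen pairs, which is the kind of generalization a learned preference model is meant to provide.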
