Evaluating Hierarchies of Verb Argument Structure with Hierarchical Clustering

Verbs can be used only with a few specific arrangements of their arguments (syntactic frames). Most theorists note that verbs can be organized into a hierarchy of verb classes based on the frames they admit. Here we show that such a hierarchy is objectively well supported by the distribution of verbs and frames in English: a systematic hierarchical clustering algorithm converges on the same structure as the handcrafted taxonomy of VerbNet, a broad-coverage verb lexicon. We also show that the hierarchies capture psychologically meaningful dimensions of generalization, since they predict novel verb coercions made by human participants. We discuss the limitations of a simple hierarchical representation and suggest similar approaches for identifying the representations that underpin verb argument structure.
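To make the idea of clustering verbs by the frames they admit concrete, the following is a minimal sketch and not the algorithm used in this work. It assumes a hypothetical toy lexicon (`VERB_FRAMES`, with made-up frame labels), Jaccard distance between frame sets, and plain average-linkage agglomerative clustering as a stand-in technique.

```python
from itertools import combinations

# Hypothetical toy lexicon (an assumption for illustration, not VerbNet data):
# each verb maps to the set of syntactic frames it admits.
VERB_FRAMES = {
    "give": {"NP V NP NP", "NP V NP PP.to"},
    "send": {"NP V NP NP", "NP V NP PP.to"},
    "break": {"NP V NP", "NP V"},
    "shatter": {"NP V NP", "NP V"},
    "hit": {"NP V NP", "NP V NP PP.with"},
}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def leaves(tree):
    """Return the verbs at the leaves of a nested-tuple tree."""
    if isinstance(tree, str):
        return (tree,)
    return leaves(tree[0]) + leaves(tree[1])


def average_linkage(t1, t2, frames):
    """Mean pairwise Jaccard distance between the verbs of two subtrees."""
    pairs = [(v, w) for v in leaves(t1) for w in leaves(t2)]
    return sum(jaccard_distance(frames[v], frames[w]) for v, w in pairs) / len(pairs)


def cluster(frames):
    """Greedy agglomerative clustering; returns a binary tree of nested tuples."""
    trees = sorted(frames)  # start with one single-verb tree per verb
    while len(trees) > 1:
        # Find the closest pair of subtrees (ties go to the first pair found).
        i, j = min(
            combinations(range(len(trees)), 2),
            key=lambda ij: average_linkage(trees[ij[0]], trees[ij[1]], frames),
        )
        merged = (trees[i], trees[j])
        trees = [t for k, t in enumerate(trees) if k not in (i, j)] + [merged]
    return trees[0]


tree = cluster(VERB_FRAMES)
print(tree)
```

In this toy setting, verbs with identical frame sets merge first, and verbs that share only some frames join higher in the tree. Comparing such an induced tree against a handcrafted taxonomy would need a separate tree-similarity or cluster-evaluation measure, which this sketch does not cover.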
