Determining the Degree of Compositionality of German Particle Verbs by Clustering Approaches

This work determines the degree of compositionality of German particle verbs using two soft clustering approaches. We assume that the more compositional a particle verb is, the more often it appears in the same cluster as its base verb once a probability threshold has been applied to establish cluster membership. German particle verbs are difficult to approach automatically at the syntax-semantics interface because they typically change the subcategorisation behaviour of their base verbs. We therefore explore the clustering approaches not only with respect to technical parameters, such as the number of clusters and the number of iterations, but also with respect to the choice of features used to describe the particle verbs.
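The shared-cluster assumption can be illustrated with a minimal sketch. It assumes each verb already has a vector of soft cluster membership probabilities. The function name `shared_cluster_score`, the threshold value, and the membership numbers are hypothetical and chosen only for illustration; the exact score used in the work may differ.

```python
from typing import Dict, List


def shared_cluster_score(
    memberships: Dict[str, List[float]],
    particle_verb: str,
    base_verb: str,
    threshold: float = 0.2,
) -> float:
    """Fraction of the particle verb's clusters that also contain its base verb.

    A verb counts as a member of a cluster when its membership
    probability is at least `threshold` (an assumed cut-off).
    """
    pv_clusters = {
        i for i, p in enumerate(memberships[particle_verb]) if p >= threshold
    }
    bv_clusters = {
        i for i, p in enumerate(memberships[base_verb]) if p >= threshold
    }
    if not pv_clusters:
        # The particle verb passed the threshold in no cluster.
        return 0.0
    return len(pv_clusters & bv_clusters) / len(pv_clusters)


# Hypothetical membership probabilities over four clusters (illustrative only).
memberships = {
    "essen":    [0.70, 0.25, 0.05, 0.00],
    "aufessen": [0.60, 0.30, 0.10, 0.00],
    "fangen":   [0.05, 0.15, 0.80, 0.00],
    "anfangen": [0.02, 0.03, 0.05, 0.90],
}

compositional = shared_cluster_score(memberships, "aufessen", "essen")
opaque = shared_cluster_score(memberships, "anfangen", "fangen")
print(compositional, opaque)
```

Under these made-up numbers, a particle verb that shares all of its clusters with its base verb receives a high score, and one that shares none receives a low score. Raising or lowering the threshold changes which clusters count as memberships, which is why it is treated as a parameter here.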
