Three Approaches to Finding German Valence Compounds

Valence compounds (German: Rektionskomposita) such as Autofahrer ‘car driver’ form a special subclass of the otherwise very heterogeneous class of nominal compounds. Because the verb underlying the head noun (fahren ‘to drive’ in the example) governs the modifier as its (accusative) object (Auto ‘car’), valence compounds allow for a straightforward (event-)semantic interpretation. Hence, the automatic detection of valence compounds constitutes an essential step towards a more comprehensive approach to the analysis of compound-internal semantic relations. Using a hand-annotated dataset of 200 examples, we develop an accurate approach that finds valence compounds in large-scale corpora.
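
The event-semantic reading described above can be illustrated with a minimal sketch. It is not the method developed in the paper: it assumes the compound has already been split into modifier and head, uses a naive rule for -er agent nouns, and looks the resulting verb-object pair up in a hypothetical table of corpus counts. The names `VERB_OBJECT_COUNTS`, `verb_from_er_noun`, and `interpret`, as well as the counts themselves, are invented for illustration.

```python
from typing import Optional, Tuple

# Hypothetical toy data: (verb, accusative object) pairs as they might be
# counted in a parsed corpus. The numbers are invented for illustration.
VERB_OBJECT_COUNTS = {
    ("fahren", "Auto"): 120,
    ("lesen", "Zeitung"): 85,
}


def verb_from_er_noun(head: str) -> Optional[str]:
    """Naively map an -er agent noun to a candidate infinitive (Fahrer -> fahren)."""
    if head.endswith("er") and len(head) > 3:
        return head[:-2].lower() + "en"
    return None


def interpret(modifier: str, head: str, min_count: int = 1) -> Optional[Tuple[str, str]]:
    """Return (verb, object) if the compound plausibly has a valence reading."""
    verb = verb_from_er_noun(head)
    if verb is None:
        return None
    # Accept the reading only if the verb was seen governing the modifier
    # as its object at least min_count times in the (toy) corpus table.
    if VERB_OBJECT_COUNTS.get((verb, modifier), 0) >= min_count:
        return (verb, modifier)
    return None


print(interpret("Auto", "Fahrer"))
print(interpret("Sonntag", "Fahrer"))
```

Under these assumptions, Autofahrer receives the reading fahren(Auto), whereas Sonntagsfahrer ‘Sunday driver’ (with the modifier reduced to Sonntag) is rejected because the toy table contains no evidence that fahren takes Sonntag as its object. A realistic system would need proper morphological analysis and corpus statistics in place of the suffix rule and the hand-written table.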
