Evaluating Verb Subcategorisation Frames learned by a German Statistical Grammar against Manual Defi

This paper describes an extensive evaluation of large-scale computational verb subcategorisation, comparing subcategorisation frames induced from a German lexicalised statistical grammar against manual verb definitions in the dictionary Duden - Das Stilwörterbuch. We achieved an f-score of 62.30% on 3,090 verbs with a training corpus frequency between 10 and 2,000; ignoring prepositional phrases within the frame definitions resulted in an f-score of 72.05%. To our knowledge, no previous approach to the automatic acquisition of verb subcategorisation has performed a comparably extensive evaluation. Our evaluation results justify the use of the statistical grammar framework for obtaining a reliable subcategorisation lexicon for verbs. The lexical entries have the potential to extend and improve manual verb definitions.
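
To make the reported measure concrete, the following is a minimal sketch of how an f-score over induced versus dictionary frames could be computed. It is an illustration under stated assumptions, not the evaluation procedure used in the paper: the frame notation (e.g. "na", "np:auf"), the helper names, the toy verbs and the pooling of counts over all verbs are all hypothetical. In particular, `drop_pp` implements only one possible reading of "ignoring prepositional phrases", namely removing the PP slot from a frame.

```python
from typing import Dict, Set

Lexicon = Dict[str, Set[str]]


def drop_pp(frame: str) -> str:
    """Remove the PP slot from a frame label (assumed notation: 'np:auf' -> 'n')."""
    base = frame.split(":")[0]   # strip the preposition annotation, if any
    return base.replace("p", "")  # strip the PP slot marker itself


def f_score(induced: Lexicon, gold: Lexicon, ignore_pp: bool = False) -> float:
    """Harmonic mean of precision and recall, with counts pooled over all gold verbs."""
    tp = fp = fn = 0
    for verb, gold_frames in gold.items():
        learned = induced.get(verb, set())
        if ignore_pp:
            gold_frames = {drop_pp(f) for f in gold_frames}
            learned = {drop_pp(f) for f in learned}
        tp += len(learned & gold_frames)   # frames found in both lexica
        fp += len(learned - gold_frames)   # induced but not in the dictionary
        fn += len(gold_frames - learned)   # in the dictionary but not induced
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data; not taken from the paper or from the Duden.
gold = {"warten": {"n", "np:auf"}, "geben": {"nad", "na"}}
induced = {"warten": {"n", "np:auf", "na"}, "geben": {"nad"}}

score_with_pp = f_score(induced, gold)
score_without_pp = f_score(induced, gold, ignore_pp=True)
```

Pooling counts over verbs is a design choice made for this sketch; averaging per-verb scores would be an equally plausible alternative and can give different figures.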