Paper information - Extracting Idiomatic Hungarian Verb Frames

Extracting Idiomatic Hungarian Verb Frames

We describe a machine learning method for collecting idiomatic fixed-stem verb frames. First, we collect frequent frame candidates from the output of a partial parser; second, we apply an idiomaticity metric to this list to select the most idiomatic frames. Running the implemented system yields a list of ten thousand frames for more than 900 verbs, which will be translated into English and used as a resource in a Hungarian-to-English machine translation system.
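The two steps in the abstract (collect frequent candidates, then rank them by idiomaticity) can be illustrated with a minimal sketch. The abstract does not say which idiomaticity metric is used, so pointwise mutual information between verb and frame stands in here purely as an assumption. The function name `rank_frames`, the `min_count` threshold, the frame notation, and the toy observations are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def rank_frames(observations, min_count=2):
    """Rank (verb, frame) pairs by an assumed idiomaticity score.

    observations: list of (verb, frame) pairs, where frame is a tuple of
    case-marked slots, optionally with a fixed stem (e.g. "rész-ACC").
    The score is pointwise mutual information (PMI); this is a stand-in,
    not the metric used in the paper.
    """
    pair_counts = Counter(observations)
    verb_counts = Counter()
    frame_counts = Counter()
    for (verb, frame), n in pair_counts.items():
        verb_counts[verb] += n
        frame_counts[frame] += n
    total = sum(pair_counts.values())

    scored = []
    for (verb, frame), n in pair_counts.items():
        # Step 1: keep only frequent frame candidates.
        if n < min_count:
            continue
        # Step 2: score how strongly the verb and the frame attract each other.
        pmi = math.log2(n * total / (verb_counts[verb] * frame_counts[frame]))
        scored.append((verb, frame, n, pmi))

    # Highest-scoring (most idiomatic under this assumption) frames first.
    scored.sort(key=lambda item: item[3], reverse=True)
    return scored


# Invented toy data standing in for partial-parser output.
observations = (
    [("vesz", ("rész-ACC", "INE"))] * 3  # "részt vesz": take part (in something)
    + [("vesz", ("ACC",))] * 4
    + [("ad", ("ACC",))] * 5
    + [("lát", ("ACC",))] * 4
    + [("lát", ("INE",))] * 1            # too rare, dropped by min_count
)

ranked = rank_frames(observations)
for verb, frame, count, score in ranked:
    print(verb, frame, count, round(score, 3))
```

On this toy input the fixed-stem frame of "vesz" ranks above its bare accusative frame, because the fixed-stem frame occurs with only that verb while the accusative frame is shared by all of them. A real system would need a metric and threshold tuned on corpus data.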

Bálint Sass
