Learning Syntactic Verb Frames using Graphical Models

We present a novel approach for building verb subcategorization lexicons using a simple graphical model. In contrast to previous methods, we show how the model can be trained without parsed input or a predefined subcategorization frame inventory. Our method outperforms the state-of-the-art on a verb clustering task, and is easily trained on arbitrary domains. This quantitative evaluation is complemented by a qualitative discussion of verbs and their frames. We discuss the advantages of graphical models for this task, in particular the ease of integrating semantic information about verbs and arguments in a principled fashion. We conclude with future work to augment the approach.

Anna Korhonen | Thomas Lippincott | Diarmuid Ó Séaghdha
