Paper information

Automatic Acquisition of a Large Subcategorization Dictionary from Corpora

This paper presents a new method for producing a dictionary of subcategorization frames from unlabelled text corpora. It is shown that statistical filtering of the results of a finite state parser running on the output of a stochastic tagger produces high quality results, despite the error rates of the tagger and the parser. Further, it is argued that this method can be used to learn all subcategorization frames, whereas previous methods are not extensible to a general solution to the problem.

Christopher D. Manning
