PropBank: the Next Level of TreeBank
It has long been recognized that syntactic structure alone does not provide enough information for machine understanding of human language. Various efforts under the auspices of MUC [8] have added limited-coverage semantic lexicons to improve the performance of the systems under evaluation. With the aim of providing data for statistical techniques, several sites are investigating semantic annotation. The Prague Tectogrammatics project [3] endeavours to annotate semantic relationships at the same time as syntactic and morphological structure. The FrameNet project [4] eschews fine-grained syntactic structure in favor of "chunked" data and semantic annotation. This paper describes the PropBank project at Penn, which adds a layer of semantic annotation atop the syntactic structure already present in the Penn TreeBank [5,6].
[1] J. Lowe, et al. A Frame-Semantic Approach to Semantic Annotation, 1997.
[2] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[3] Eva Hajicová, et al. Argument/Valency Structure in PropBank, LCS Database and Prague Dependency Treebank: A Comparative Pilot Study, 2002, LREC.
[4] Jarmila Panevová, et al. Tectogrammatics in corpus tagging, 2001.