Recent years have seen a revived interest in semantic parsing, driven by the application of statistical and machine-learning methods to semantically annotated corpora such as FrameNet and the Proposition Bank. So far, much of this research has focused on English because semantically annotated resources are lacking in other languages. In this paper, we report first results on semantic role labeling using a pre-release version of the Chinese Proposition Bank. Since the Chinese Proposition Bank is superimposed on the Chinese Treebank, i.e., the semantic role labels are assigned to constituents in a treebank parse tree, we start by reporting results of experiments using the hand-crafted parses in the treebank. This gives us a measure of the extent to which the semantic role labels can be bootstrapped from the syntactic annotation in the treebank. We then report experiments using a fully automatic Chinese parser that integrates word segmentation, POS tagging, and parsing. This gauges how successfully semantic role labeling can be done for Chinese in realistic situations. We show that our results using hand-crafted parses are slightly higher than those reported for state-of-the-art semantic role labeling systems for English using the Penn English Proposition Bank data, even though the Chinese Proposition Bank is smaller. When an automatic parser is used, however, the accuracy of our system is much lower than the English state of the art. This reveals an interesting cross-linguistic difference between the two languages, which we attempt to explain. We also describe a method for inducing verb classes from the Proposition Bank "frame files" that can be used to improve semantic role labeling.