Verb-noun sequences in Chinese often create parsing ambiguities. These ambiguities can usually be resolved if we know in advance whether the verb and the noun tend to stand in a verb-object relation or a modifier-head relation. In this paper, we describe a learning procedure by which such knowledge can be acquired automatically. Using an existing (imperfect) parser with a chart filter and a tree filter, a large corpus, and the log-likelihood-ratio (LLR) algorithm, we acquire verb-noun pairs that typically occur in either verb-object or modifier-head relations. The learned pairs are then used for disambiguation during parsing. Evaluation shows that the accuracy of the original parser improves significantly with the use of the automatically acquired knowledge.
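As an illustration of the LLR statistic mentioned above, the following minimal sketch computes the log-likelihood ratio (G²) for a 2x2 contingency table of co-occurrence counts. The function names `llr` and `pair_table`, the way the table is built from relation-specific counts, and the example counts are all assumptions made for illustration; the abstract does not specify how the counts are collected or how the scores are thresholded.

```python
import math


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: verb and noun together, k12: verb without the noun,
    k21: noun without the verb,  k22: neither.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    total = 0.0
    for k, i, j in cells:
        if k > 0:  # a zero cell contributes nothing to the sum
            total += k * math.log(k * n / (rows[i] * cols[j]))
    return 2.0 * total


def pair_table(pair_count, verb_count, noun_count, total):
    """Build the 2x2 table from marginal counts (hypothetical helper).

    All counts are assumed to come from parses in which the verb and
    noun stand in one particular relation, e.g. verb-object.
    """
    return (
        pair_count,
        verb_count - pair_count,
        noun_count - pair_count,
        total - verb_count - noun_count + pair_count,
    )


# Hypothetical counts for one verb-noun pair under each relation.
verb_object_score = llr(*pair_table(30, 200, 150, 100000))
modifier_head_score = llr(*pair_table(2, 40, 90, 100000))
preferred = "verb-object" if verb_object_score > modifier_head_score else "modifier-head"
print(preferred, verb_object_score, modifier_head_score)
```

A higher score indicates that the pair co-occurs in that relation more often than independence would predict. Comparing the two scores directly, as done here, is only one possible decision rule.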