A PROBABILISTIC CHINESE BASENP RECOGNITION MODEL COMBINED WITH SYNTACTIC COMPOSITION TEMPLATES

This paper proposes a formal definition of the Chinese baseNP. The definition is shown to be operational in two ways: it supports the formulation of a Chinese baseNP annotation specification oriented toward information processing, and it supports the extraction of syntactic composition templates. The paper argues that the syntactic composition templates are a necessary but not a sufficient condition for baseNP recognition, so boundary ambiguity and phrase-type ambiguity cannot be resolved by the templates alone. On this basis, the basic templates that embody baseNP composition and the N-grams that model contextual constraints are integrated into a new probabilistic model for Chinese baseNP recognition. Experiments show that the model outperforms an N-gram model based on part-of-speech information.
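
The idea that templates propose candidate spans while contextual statistics choose among them can be illustrated with a minimal sketch. It is not the model of the paper, whose formulation is not given in this abstract. The part-of-speech tags, the templates, all probability values, the threshold, and the function names `candidates`, `score`, and `recognize` are assumptions made purely for illustration, and the greedy selection stands in for whatever decoding the actual model uses.

```python
# Hypothetical POS-sequence templates with made-up probabilities that a
# matching span is a baseNP (illustrative values, not from the paper).
TEMPLATES = {
    ("n",): 0.6,
    ("n", "n"): 0.8,
    ("a", "n"): 0.7,
}

# Hypothetical contextual probabilities keyed by (left POS, right POS)
# of a candidate span; "<s>" and "</s>" mark the sentence boundaries.
CONTEXT = {
    ("v", "</s>"): 0.9,
    ("v", "n"): 0.2,
    ("n", "</s>"): 0.3,
}
DEFAULT_CONTEXT = 0.5  # assumed back-off value for unseen contexts


def candidates(tags):
    """Yield every (start, end) span whose POS sequence matches a template."""
    for start in range(len(tags)):
        for template in TEMPLATES:
            end = start + len(template)
            if tuple(tags[start:end]) == template:
                yield start, end


def score(tags, start, end):
    """Combine the template probability with the contextual probability."""
    left = tags[start - 1] if start > 0 else "<s>"
    right = tags[end] if end < len(tags) else "</s>"
    template_p = TEMPLATES[tuple(tags[start:end])]
    context_p = CONTEXT.get((left, right), DEFAULT_CONTEXT)
    return template_p * context_p


def recognize(tags, threshold=0.5):
    """Greedily keep the best-scoring, non-overlapping spans above threshold."""
    scored = sorted(
        ((score(tags, s, e), s, e) for s, e in candidates(tags)),
        reverse=True,
    )
    chosen = []
    for p, s, e in scored:
        if p < threshold:
            break
        # Accept the span only if it overlaps no span chosen so far.
        if all(e <= cs or s >= ce for cs, ce in chosen):
            chosen.append((s, e))
    return sorted(chosen)


tags = ["v", "n", "n"]
print(sorted(candidates(tags)))  # three template matches compete
print(recognize(tags))           # context keeps only the span (1, 3)
```

For the tag sequence `v n n`, the templates alone match three overlapping spans, which is the boundary ambiguity described above. Only after the contextual score is applied does a single span remain.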