Weighted Probabilistic Sum Model Based on Decision Tree Decomposition for Text Chunking

Text chunking is often regarded as a word classification task. To classify a given word into its correct class, we need to exploit various features within its context. In this paper, we propose a weighted probabilistic sum model for text chunking, which exploits various weighted context features and finally selects an optimal sequence of chunk tags. For effective feature selection, we use a decision tree as an intermediate feature space inducer. To obtain a more compact feature set with less computational load, we organize a partially ordered feature space according to the information gain ratio (IGR) distribution of the features. Our approach to text chunking proceeds in four steps: (1) learning a decision tree in a universal feature space, (2) reorganizing the feature space by decomposing the decision tree, (3) estimating feature weights and prediction probabilities, and (4) predicting possible chunk tags and searching for an optimal chunk tag sequence. To alleviate the sparse data problem, we integrate general features with specific ones, that is, context features of different sizes and attributes. In addition, we combine words with WordNet-based word classes for English text chunking. Experimental results show that our method improves the performance of text chunking and overcomes the sparse data problem.
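
The weighted probabilistic sum in steps (3) and (4) can be read as scoring each candidate tag t by sum_i w_i * P(t | f_i), where f_i is the i-th context feature and w_i its weight. The following Python sketch illustrates one plausible instantiation under stated assumptions: weights are taken as normalized information gain ratios, the decision-tree decomposition of steps (1) and (2) is elided (per-token features are assumed to be already extracted), and decoding is a simple per-token argmax rather than the paper's optimal sequence search. All function names and helpers are illustrative, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict

def information_gain_ratio(examples, feature_idx):
    """IGR of one feature over (feature_tuple, tag) training examples:
    information gain divided by split information (C4.5 criterion)."""
    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())

    tags = [tag for _, tag in examples]
    base = entropy(tags)
    by_value = defaultdict(list)
    for feats, tag in examples:
        by_value[feats[feature_idx]].append(tag)
    n = len(examples)
    gain = base - sum(len(sub) / n * entropy(sub) for sub in by_value.values())
    split_info = entropy([feats[feature_idx] for feats, _ in examples])
    return gain / split_info if split_info > 0 else 0.0

def train(examples, num_features):
    """Estimate per-feature weights (normalized IGR) and
    P(tag | feature value) tables from training examples."""
    igr = [information_gain_ratio(examples, i) for i in range(num_features)]
    total = sum(igr) or 1.0
    weights = [g / total for g in igr]

    # Count tag occurrences per feature value, then normalize to probabilities.
    cond = [defaultdict(Counter) for _ in range(num_features)]
    for feats, tag in examples:
        for i, value in enumerate(feats):
            cond[i][value][tag] += 1
    probs = [
        {v: {t: c / sum(cnt.values()) for t, c in cnt.items()}
         for v, cnt in table.items()}
        for table in cond
    ]
    return weights, probs

def score(feats, tag, weights, probs):
    """Weighted probabilistic sum: sum_i w_i * P(tag | f_i).
    Unseen feature values contribute zero probability."""
    return sum(w * probs[i].get(feats[i], {}).get(tag, 0.0)
               for i, w in enumerate(weights))

def predict_tags(sentence_feats, tagset, weights, probs):
    """Per-token argmax over the weighted probabilistic sum; the paper
    instead searches an optimal tag sequence over the whole sentence."""
    return [max(tagset, key=lambda t: score(f, t, weights, probs))
            for f in sentence_feats]
```

With training examples given as (feature_tuple, chunk_tag) pairs, e.g. (("bought", "VBD", "NN"), "B-VP"), train returns the weight vector and conditional probability tables that score and predict_tags consume; replacing the per-token argmax with a Viterbi-style search would match the sequence optimization described in step (4).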
