A Feature Extraction Method Using Base Phrases and Keywords in Chinese Text

Feature extraction is a key technology in text categorization. Traditional text classification uses words as features, and their effectiveness for classification is evident. The feature extraction method using base phrases and keywords takes feature extraction for Chinese text further, at both the syntactic and the semantic level. First, the characteristics of base noun phrases (baseNP) and base verb phrases (baseVP) are analyzed. Next, words that conform to the phrase rules are combined into baseNPs and baseVPs. Finally, word sense disambiguation (WSD) is applied to the remaining words. The paper thus proposes a stepwise feature extraction that proceeds from words to phrases. Experimental results show that this method performs much better than the traditional word-based feature extraction method, improving both the precision and the recall of text classification.
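The stepwise procedure can be illustrated with a minimal Python sketch. It is not the paper's implementation: the phrase rules, the POS tag set (`n`, `a`, `d`, `v`, `r`), the toy sense inventory, and the overlap-based disambiguation are all assumptions chosen for illustration, and the input is assumed to be already segmented and POS-tagged.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (word, POS tag); tag names are assumed

# Hypothetical phrase rules: tags allowed inside a phrase, and the tag
# the phrase must end with (its head).
RULES: Dict[str, Tuple[Set[str], str]] = {
    "baseNP": ({"a", "n"}, "n"),  # adjectives/nouns ending in a noun
    "baseVP": ({"d", "v"}, "v"),  # adverbs/verbs ending in a verb
}

# Toy sense inventory (an assumption): sense label -> signature words.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "苹果": {
        "苹果#fruit": {"吃", "水果"},
        "苹果#company": {"手机", "公司"},
    },
}


def chunk(tokens: List[Token]) -> List[Token]:
    """Greedily merge runs of two or more words that satisfy a phrase rule."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        matched = False
        for label, (allowed, head) in RULES.items():
            j = i
            while j < len(tokens) and tokens[j][1] in allowed:
                j += 1
            # Shrink the run until it ends with the required head tag.
            while j > i and tokens[j - 1][1] != head:
                j -= 1
            if j - i >= 2:
                out.append(("".join(w for w, _ in tokens[i:j]), label))
                i = j
                matched = True
                break
        if not matched:
            out.append(tokens[i])
            i += 1
    return out


def disambiguate(word: str, context: Set[str]) -> str:
    """Pick the sense whose signature overlaps the context most, if any."""
    senses = SENSES.get(word)
    if not senses:
        return word
    best = max(senses, key=lambda s: len(senses[s] & context))
    return best if senses[best] & context else word


def extract_features(tokens: List[Token]) -> List[str]:
    """Phrases become phrase features; remaining words are sense-tagged."""
    context = {w for w, _ in tokens}
    features = []
    for text, tag in chunk(tokens):
        if tag in RULES:
            features.append(f"{tag}:{text}")
        else:
            features.append(disambiguate(text, context))
    return features


sentence = [("我", "r"), ("经常", "d"), ("吃", "v"), ("苹果", "n")]
print(extract_features(sentence))
```

In this sketch the adverb and verb are merged into one baseVP feature, while the stand-alone noun is kept as a keyword and tagged with the sense that best matches its sentence context. A real system would replace the toy rules and sense inventory with ones derived from a treebank and a lexical resource.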