Towards effective web page classification

In order to manage and organize information on the web, we propose a novel web page classification strategy integrating topic model and SVM. We use topic model to harness the implicit information on web pages for feature extraction. Accuracy of the strategy is 84.15%, 2.23% superior to the traditional classification strategy based on CHI.