Method of Web Information Extraction Based on Decision Tree

Because data on the Web is constantly updated, this paper studies decision tree techniques and how to apply them to Web information extraction. From the datasets obtained by information extraction, a decision tree for the agricultural products market is constructed with the C4.5/C5.0 algorithm. The tree is updated as new data arrive and is then used to generate understandable rules. The experiment shows that Web information extraction based on decision trees is feasible.
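The pipeline described in the abstract (build a tree, refresh it with new data, read off rules) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses the gain-ratio criterion associated with C4.5 on categorical attributes only, omits pruning and continuous attributes, and handles updates by simply rebuilding the tree on the enlarged dataset, which is an assumption. The records, attribute names (`supply`, `season`, `trend`) and function names are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(rows, attr, target):
    """Information gain of splitting on attr, divided by the split information."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    split_info = entropy([r[attr] for r in rows])
    return (base - remainder) / split_info if split_info > 0 else 0.0

def build_tree(rows, attrs, target):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    if gain_ratio(rows, best, target) <= 0:
        return majority  # no attribute separates the classes any further
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, rest, target)
    return (best, branches)

def extract_rules(tree, conditions=()):
    """Turn every root-to-leaf path into a readable IF ... THEN ... rule."""
    if not isinstance(tree, tuple):
        lhs = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        return [f"IF {lhs} THEN {tree}"]
    attr, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attr, value),)))
    return rules

# Hypothetical records standing in for data extracted from Web pages.
rows = [
    {"supply": "high", "season": "summer", "trend": "falling"},
    {"supply": "high", "season": "winter", "trend": "falling"},
    {"supply": "low", "season": "summer", "trend": "rising"},
    {"supply": "low", "season": "winter", "trend": "rising"},
    {"supply": "medium", "season": "summer", "trend": "falling"},
    {"supply": "medium", "season": "winter", "trend": "rising"},
]
attrs = ["supply", "season"]

tree = build_tree(rows, attrs, "trend")
rules = extract_rules(tree)
for rule in rules:
    print(rule)

# Assumed update strategy: when new records arrive, rebuild on all data.
new_rows = rows + [{"supply": "medium", "season": "spring", "trend": "rising"}]
updated_rules = extract_rules(build_tree(new_rows, attrs, "trend"))
```

On this toy data the root split is `supply`, since its gain ratio is higher than that of `season`, and only the `medium` branch needs a second test. Rebuilding from scratch is the simplest way to keep the rules current; an incremental update scheme would avoid reprocessing old records but is beyond this sketch.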
