Using J-Pruning to Reduce Overfitting of Classification Rules in Noisy Domains

The automatic induction of classification rules from examples is an important technique used in data mining. One of the problems encountered is the overfitting of rules to training data. This paper describes a means of reducing overfitting known as J-pruning, which is based on the J-measure, an information-theoretic means of quantifying the information content of a rule. It examines the effectiveness of J-pruning in the presence of noisy data for two rule induction algorithms: one in which the rules are generated via the intermediate representation of a decision tree and one in which they are generated directly from the examples.
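
For readers unfamiliar with the J-measure mentioned above, the sketch below illustrates the standard Smyth and Goodman formulation, J(X; Y=y) = p(y) · j(X; Y=y), which quantifies the information content of a rule "IF y THEN x". The function name and the example probabilities are illustrative assumptions, not values taken from this paper.

```python
import math

def j_measure(p_y, p_x, p_x_given_y):
    """J-measure of a rule 'IF y THEN x' (Smyth & Goodman form):
    J(X; Y=y) = p(y) * [ p(x|y) * log2(p(x|y)/p(x))
                         + (1 - p(x|y)) * log2((1 - p(x|y))/(1 - p(x))) ]

    p_y         -- probability that the rule's antecedent y fires
    p_x         -- prior probability of the consequent class x
    p_x_given_y -- probability of x among the examples covered by y
    (assumes 0 < p_x < 1; degenerate priors are not handled here)
    """
    def term(p, q):
        # Contribution p * log2(p / q), taken as 0 when p == 0.
        return 0.0 if p == 0 else p * math.log2(p / q)

    j_inner = term(p_x_given_y, p_x) + term(1 - p_x_given_y, 1 - p_x)
    return p_y * j_inner

# Hypothetical example: the antecedent covers 30% of the examples,
# the class prior is 0.5, and 90% of covered examples have the class.
print(j_measure(0.3, 0.5, 0.9))  # ~0.159 bits
```

Intuitively, the first factor rewards rules that apply to many examples, while the bracketed term rewards rules whose consequent probability differs sharply from the class prior; J-pruning uses this trade-off to decide when further specialisation of a rule is no longer worthwhile.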