Rule-based Machine Learning Methods for Functional Prediction

We describe a machine learning method for predicting the value of a real-valued function, given the values of multiple input variables. The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules. A central objective of the method and representation is the induction of compact, easily interpretable solutions. This rule-based decision model can be extended to search efficiently for similar cases prior to approximating function values. Experimental results on real-world data demonstrate that the new techniques are competitive with existing machine learning and statistical methods and can sometimes yield superior regression performance.
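To make the representation concrete, the following is a minimal sketch of how an ordered list of DNF rules might be evaluated for regression. It is an illustration under stated assumptions, not the induction procedure itself. The names `Rule`, `predict`, `first_match`, and `predict_with_neighbors`, the feature names `x1` to `x3`, the thresholds, and the constant value attached to each rule are all hypothetical. The nearest-neighbor step is one plausible reading of "searching for similar cases": it averages the k closest training cases covered by the same rule, using squared Euclidean distance as an assumed similarity measure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Case = Dict[str, float]
Condition = Callable[[Case], bool]


@dataclass
class Rule:
    # DNF: a disjunction (outer list) of conjunctions (inner lists) of tests.
    dnf: List[List[Condition]]
    value: float  # constant predicted when this rule fires (assumed form)

    def fires(self, case: Case) -> bool:
        return any(all(cond(case) for cond in conj) for conj in self.dnf)


def first_match(rules: Sequence[Rule], case: Case) -> int:
    """Index of the first rule that fires, or -1 if none does."""
    for i, rule in enumerate(rules):
        if rule.fires(case):
            return i
    return -1


def predict(rules: Sequence[Rule], default: float, case: Case) -> float:
    """Ordered evaluation: the first rule that fires decides the value."""
    i = first_match(rules, case)
    return rules[i].value if i >= 0 else default


def predict_with_neighbors(rules: Sequence[Rule], default: float,
                           train: Sequence[Tuple[Case, float]],
                           case: Case, k: int = 2) -> float:
    """Average the k nearest training cases covered by the same rule."""
    i = first_match(rules, case)
    pool = [(x, y) for x, y in train if first_match(rules, x) == i]
    if not pool:
        return predict(rules, default, case)
    # Assumed similarity measure: squared Euclidean distance over the features.
    pool.sort(key=lambda xy: sum((xy[0][f] - case[f]) ** 2 for f in case))
    nearest = pool[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical rule set; thresholds and values are invented for illustration.
rules = [
    Rule(dnf=[[lambda c: c["x1"] <= 2.0, lambda c: c["x2"] > 5.0],
              [lambda c: c["x3"] > 10.0]], value=1.5),
    Rule(dnf=[[lambda c: c["x1"] > 2.0]], value=4.0),
]

train = [
    ({"x1": 3.0, "x2": 0.0, "x3": 0.0}, 4.0),
    ({"x1": 4.0, "x2": 0.0, "x3": 0.0}, 6.0),
    ({"x1": 10.0, "x2": 0.0, "x3": 0.0}, 20.0),
    ({"x1": 1.0, "x2": 6.0, "x3": 0.0}, 1.0),
]
```

Because the rules are ordered, a case that satisfies both rules receives the value of the earlier one; a case that satisfies none falls through to the default. The neighbor-based variant restricts the search to cases covered by the same rule, which keeps the candidate pool small before any distances are computed.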
