Induction of ripple-down rules applied to modeling large databases

A methodology for modeling large data sets is described that yields rule sets which have minimal inter-rule interactions and are simple to maintain. An algorithm for developing such rule sets automatically is described, and its efficacy is shown on standard test data sets. Comparative studies of manual and automatic modeling of a data set of some nine thousand five hundred cases are reported. A further study is reported in which ten years of patient data were modeled on a month-by-month basis to determine how well a diagnostic system developed by automated induction would have performed had it been in use throughout the project.
