Enhancing knowledge discovery via association-based evolution of neural logic networks

The comprehensibility aspect of rule discovery is of emerging interest in the realm of knowledge discovery in databases. Of the many cognitive and psychological factors relating to the comprehensibility of knowledge, we focus on the use of human-amenable concepts as a representation language for expressing classification rules. Existing work on neural logic networks (or neulonets) provides the impetus for our research; their strength lies in their ability to learn and represent complex human decision-making logic through symbolically interpretable net rules. We develop a novel technique for neulonet learning that composes net rules using genetic programming. Coupled with a sequential covering approach that generates an ordered list of neulonets, the straightforward extraction of human-like logic rules from each neulonet widens the range of knowledge that can be expressed and discovered, while the list of neulonets as a whole constitutes an effective classifier. We show that the sequential covering approach is analogous to association-based classification, which leads to the development of an association-based neulonet classifier. An empirical study shows that associative classification integrated with the genetic construction of neulonets outperforms general association-based classifiers, achieving higher accuracy with smaller rule sets. We attribute this to the richer logic expression inherent in the neulonet learning paradigm.
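
To make the two components concrete, the sketch below (Python, not taken from the paper) illustrates the ordered-pair node semantics commonly assumed for neural logic networks in Teh's formulation, together with a generic sequential-covering loop wrapped around a genetic-programming rule learner. The names NetRule, evolve_neulonet, covers, and the coverage threshold are hypothetical placeholders, not the authors' implementation.

from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Ordered-pair activations of a neural logic network (Teh's formulation,
# assumed here): (1, 0) = true, (0, 1) = false, (0, 0) = don't know.
TRUE, FALSE, UNKNOWN = (1, 0), (0, 1), (0, 0)

@dataclass
class NetRule:
    """One neulonet output node, parameterised by ordered-pair weights."""
    weights: List[Tuple[float, float]]   # (alpha_i, beta_i) for each input

    def activate(self, inputs: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
        # net = sum(a_i * alpha_i - b_i * beta_i); fire true at >= 1,
        # false at <= -1, and propagate "don't know" otherwise.
        net = sum(a * alpha - b * beta
                  for (a, b), (alpha, beta) in zip(inputs, self.weights))
        if net >= 1:
            return TRUE
        if net <= -1:
            return FALSE
        return UNKNOWN

def sequential_covering(examples, evolve_neulonet, covers, min_covered=1):
    """Grow an ordered list of neulonet rules: evolve one rule by genetic
    programming, remove the training examples it covers, and repeat until
    no rule with useful coverage can be found."""
    remaining, rule_list = list(examples), []
    while remaining:
        rule = evolve_neulonet(remaining)                  # GP search for one net rule
        covered = [x for x in remaining if covers(rule, x)]
        if len(covered) < min_covered:                     # coverage stalled; stop
            break
        rule_list.append(rule)
        remaining = [x for x in remaining if not covers(rule, x)]
    return rule_list

Under this activation rule, for example, a two-input Kleene conjunction can be expressed with weights (0.5, 2) on both inputs: the node outputs true only when both inputs are true, false as soon as either input is false, and "don't know" otherwise. Disjunction, negation, priority, and majority net rules are expressed analogously, which is the expressive richness the abstract appeals to.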
