Vector Quantization with Rule Extraction for Mixed Domain Data

We use a variant of learning vector quantization (LVQ) to extract a rule-based characterization of labeled mixed-domain data. Standard LVQ is improved in two ways: prototype calculation becomes more stable, and an adaptive metric automatically detects the importance of the individual input components. The resulting component weighting reflects how much each component contributes to a good classification, and the obtained ranking is used to generate a classification tree from the prototypes by calculating natural splits of the input space. We present generalized relevance learning vector quantization (GRLVQ) and its extension, supervised relevance neural gas (SRNG), and show how both methods can be used to extract so-called BB-classification trees and rules from data. Our experiments use artificial data, data from the UCI repository, and linguistic data. Since the data domains may mix real values, integers, and discrete nominal data, we also discuss appropriate preprocessing techniques.
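To make the idea of an adaptive metric concrete, the following is a minimal sketch of one relevance-weighted LVQ update in the spirit of GRLVQ, not the authors' implementation. It assumes a weighted squared Euclidean distance with one relevance factor per input component, an identity cost function, and constant gradient factors absorbed into the learning rates. The names `weighted_sq_dist`, `grlvq_step`, `demo`, `eps_w`, and `eps_l`, as well as the toy data, are hypothetical and chosen for illustration only.

```python
import random


def weighted_sq_dist(x, w, lam):
    """Squared Euclidean distance with one relevance factor per component."""
    return sum(l * (xi - wi) ** 2 for l, xi, wi in zip(lam, x, w))


def grlvq_step(x, y, protos, labels, lam, eps_w=0.05, eps_l=0.005):
    """One stochastic update for sample x with class y (updates in place).

    Assumptions of this sketch: identity cost function, at least one
    prototype per class, constant factors folded into eps_w and eps_l.
    """
    dists = [weighted_sq_dist(x, w, lam) for w in protos]
    same = [i for i, c in enumerate(labels) if c == y]
    other = [i for i, c in enumerate(labels) if c != y]
    jp = min(same, key=dists.__getitem__)    # closest correct prototype
    jm = min(other, key=dists.__getitem__)   # closest wrong prototype
    dp, dm = dists[jp], dists[jm]
    denom = (dp + dm) ** 2
    if denom == 0.0:
        return
    # Derivatives of (dp - dm) / (dp + dm) w.r.t. dp and dm, up to a factor 2.
    cp, cm = dm / denom, dp / denom
    wp, wm = protos[jp], protos[jm]
    for i, xi in enumerate(x):
        diff_p, diff_m = xi - wp[i], xi - wm[i]
        lam_i = lam[i]
        wp[i] += eps_w * cp * lam_i * diff_p   # attract the correct prototype
        wm[i] -= eps_w * cm * lam_i * diff_m   # repel the wrong prototype
        # Relevance update, clipped so that factors stay non-negative.
        lam[i] = max(0.0, lam_i - eps_l * (cp * diff_p ** 2 - cm * diff_m ** 2))
    total = sum(lam)
    if total > 0.0:
        for i in range(len(lam)):
            lam[i] /= total                    # keep relevances summing to 1


def demo(seed=0):
    """Toy data: component 0 separates the classes, component 1 is noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(200):
        y = rng.randint(0, 1)
        data.append(([rng.gauss(float(y), 0.1), rng.random()], y))
    protos = [[0.2, 0.5], [0.8, 0.5]]          # one prototype per class
    labels = [0, 1]
    lam = [0.5, 0.5]                           # start with equal relevances
    for _ in range(30):
        rng.shuffle(data)
        for x, y in data:
            grlvq_step(x, y, protos, labels, lam)
    return protos, lam


if __name__ == "__main__":
    protos, lam = demo()
    print("prototypes:", protos)
    print("relevances:", lam)
```

On this toy problem the relevance of the noise component should shrink while the relevance of the separating component grows, which is the kind of component ranking that can then guide the order of splits in a classification tree.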
