Visual, linguistic data mining using Self-Organizing Maps

Data mining methods are becoming vital as the amount and complexity of available data grow rapidly. Visual data mining methods aim to include a human observer in the loop and leverage human perception for knowledge extraction. However, for large datasets, the rough knowledge gained through visualization is often insufficient. In such cases, data summarization can provide further insight into the problem at hand. Linguistic descriptors such as linguistic summaries and linguistic rules can be used in data summarization to make datasets easier to understand. This paper presents a Visual Linguistic Summarization tool (VLS-SOM) that combines the visual data mining capability of the Self-Organizing Map (SOM) with the understandability of linguistic descriptors. It also presents new quality measures for ranking predictive rules. The tool enables users to 1) interactively derive summaries and rules about interesting behaviors of the data visualized through the SOM, and 2) visualize linguistic descriptors and visually assess the importance of the generated summaries and rules. The tool was tested on two benchmark problems, where it helped identify important features of the datasets. The visualization enabled the identification of the most important summaries. For classification, the visualization proved useful in identifying multiple rules that classify the dataset.
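To make the notion of a linguistic summary concrete, the following is a minimal sketch of how the degree of truth of a statement of the form "Q records are S" (for example, "most records are high") is commonly computed with fuzzy sets. It is not the VLS-SOM implementation: the helper names (ramp_up, truth_of_summary), the membership breakpoints, and the sample values are all assumptions chosen for illustration.

```python
def ramp_up(x, lo, hi):
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def truth_of_summary(values, summarizer, quantifier):
    """Degree of truth of 'Q records are S' for a relative quantifier Q.

    summarizer maps a record value to its membership in S;
    quantifier maps a proportion in [0, 1] to its membership in Q.
    """
    if not values:
        raise ValueError("values must not be empty")
    proportion = sum(summarizer(v) for v in values) / len(values)
    return quantifier(proportion)


# Hypothetical fuzzy sets; the breakpoints are illustrative assumptions.
def is_high(x):
    return ramp_up(x, 5.0, 6.0)


def most(p):
    return ramp_up(p, 0.3, 0.8)


# Hypothetical attribute values, not taken from the paper's benchmarks.
values = [4.0, 5.5, 6.0, 7.0]
truth = truth_of_summary(values, is_high, most)
print(f"Truth of 'most records are high': {truth:.2f}")
```

A summary with a truth value close to 1 describes the data well, which is the kind of score a tool could use to rank candidate summaries before showing them to the user.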
