Incremental classification of process data for anomaly detection based on similarity analysis

Performance evaluation and anomaly detection in complex systems are time-consuming tasks that rely on analyzing, comparing, and classifying many different data sets from real operations. This paper presents an original computational scheme for unsupervised incremental classification of large data sets, based on a specially introduced similarity analysis method. First, so-called compressed data models are obtained from the original large data sets by a newly proposed sequential clustering algorithm. The data sets are then compared pairwise, not directly but through their respective compressed data models. Each pair is evaluated by a similarity analysis method that uses so-called Intelligent Sensors (Agents) and data potentials. Finally, a classification decision is generated by applying a predefined similarity threshold. The applicability of the proposed scheme to anomaly detection over many large data sets is demonstrated on an example of 18 synthetic data sets. Suggestions for further improving the computational scheme and broadening its applicability are also discussed.
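The three stages named in the abstract (compression, pairwise similarity, threshold decision) can be illustrated with a minimal sketch. The abstract does not specify the sequential clustering algorithm or the Intelligent Sensors and data potentials computation, so the code below substitutes simple stand-ins: a one-pass leader-style clustering for the compression step and a weighted nearest-center distance mapped through an exponential for the similarity. All function names, the `radius` and `scale` parameters, and the toy data are assumptions made for illustration only.

```python
import math

def sequential_clustering(points, radius):
    """Stand-in compression step: one-pass leader-style clustering.
    Returns a compressed model as a list of (center, count) pairs."""
    centers, counts = [], []
    for p in points:
        best, best_d = None, float("inf")
        for i, c in enumerate(centers):
            d = math.dist(p, c)
            if d < best_d:
                best, best_d = i, d
        if best is not None and best_d <= radius:
            # Absorb the point and update the center as a running mean.
            counts[best] += 1
            n = counts[best]
            centers[best] = tuple(ci + (pi - ci) / n
                                  for ci, pi in zip(centers[best], p))
        else:
            # No center is close enough: open a new cluster.
            centers.append(tuple(p))
            counts.append(1)
    return list(zip(centers, counts))

def one_sided(a, b):
    """Count-weighted mean distance from each center of a to its nearest center in b."""
    total = sum(w for _, w in a)
    return sum(w * min(math.dist(c, cb) for cb, _ in b) for c, w in a) / total

def similarity(a, b, scale=1.0):
    """Stand-in similarity in (0, 1]; NOT the paper's sensor/potential method."""
    d = 0.5 * (one_sided(a, b) + one_sided(b, a))
    return math.exp(-d / scale)

def incremental_classify(datasets, radius, threshold):
    """Assign each incoming data set to the most similar existing class,
    or open a new class when no similarity reaches the threshold."""
    prototypes, labels = [], []  # one compressed model per class
    for data in datasets:
        model = sequential_clustering(data, radius)
        scores = [similarity(model, p) for p in prototypes]
        if scores and max(scores) >= threshold:
            labels.append(scores.index(max(scores)))
        else:
            prototypes.append(model)
            labels.append(len(prototypes) - 1)
    return labels

def blob(cx, cy):
    """Toy data set: a 5 x 5 grid of points with corner (cx, cy)."""
    return [(cx + 0.1 * i, cy + 0.1 * j) for i in range(5) for j in range(5)]

datasets = [blob(0, 0), blob(0.05, 0), blob(5, 5), blob(5.05, 5)]
labels = incremental_classify(datasets, radius=0.3, threshold=0.5)
print(labels)
```

In this toy run, the two data sets near the origin share one class and the two near (5, 5) share another. In an anomaly detection setting, a data set that opens a new class could be flagged for inspection, although how the paper turns class assignments into anomaly decisions is not stated in the abstract.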
