Fuzzy Clustering based Anomaly Detection for Distributed Multi-view Data

Anomaly detection aims to identify abnormal instances whose behavior deviates significantly from that of the others. Owing to the diversity of data generation sources, different attributes of the same instances may reside on distributed parties, forming a multi-view dataset. Multi-view anomaly detection has therefore become a key task for discovering outliers across views. Traditionally, performing multi-view anomaly detection requires centralizing the data instances from all views on a single machine. However, in many real-world scenarios it is impractical to send data from the different views to a master machine because of privacy concerns. Motivated by this, we propose a fuzzy clustering based distributed approach for multi-view anomaly detection that simultaneously learns a membership degree matrix for each view and then detects anomalies for all parties. Specifically, we first introduce a combined fuzzy c-means clustering method for multi-view data and then design an anomaly measurement criterion that quantifies the anomaly score from the membership degree matrices. To solve the proposed model, we provide a protocol that coordinates all parties in performing a well-designed optimization iteratively. Experiments on three datasets with different anomaly settings demonstrate the effectiveness of our approach.
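
To make the notion of a membership degree matrix concrete, the following is a minimal single-view sketch of the standard fuzzy c-means updates in plain Python. It is not the combined multi-view model or the distributed protocol described above, whose details are not given here. The function names, the fuzzifier default `m=2.0`, and in particular `ambiguity_score` (one minus the largest membership) are illustrative assumptions, standing in for the anomaly measurement criterion rather than reproducing it.

```python
import math


def _distance(x, c):
    """Euclidean distance between a point and a center."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))


def memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update; every row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    U = []
    for x in points:
        d = [_distance(x, c) for c in centers]
        if any(di == 0.0 for di in d):
            # The point coincides with a center: share membership among the hits.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            total = sum(hits)
            U.append([h / total for h in hits])
        else:
            inv = [di ** -exponent for di in d]
            total = sum(inv)
            U.append([v / total for v in inv])
    return U


def update_centers(points, U, m=2.0):
    """Each center is the mean of the points weighted by membership ** m."""
    n, dim, c = len(points), len(points[0]), len(U[0])
    centers = []
    for k in range(c):
        w = [U[i][k] ** m for i in range(n)]
        total = sum(w)
        centers.append(
            [sum(w[i] * points[i][j] for i in range(n)) / total for j in range(dim)]
        )
    return centers


def fuzzy_c_means(points, init_centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    centers = [list(c) for c in init_centers]
    for _ in range(iters):
        U = memberships(points, centers, m)
        centers = update_centers(points, U, m)
    return memberships(points, centers, m), centers


def ambiguity_score(U):
    """Hypothetical score (an assumption, not the paper's criterion):
    a point with no dominant cluster gets a score close to 1 - 1/c."""
    return [1.0 - max(row) for row in U]


if __name__ == "__main__":
    # Two tight groups plus one point halfway between them.
    view = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (5.5, 5.5)]
    U, centers = fuzzy_c_means(view, init_centers=[(0, 0), (11, 11)])
    for point, score in zip(view, ambiguity_score(U)):
        print(point, round(score, 3))
```

In a distributed setting each party would hold its own `view` and compute its own membership matrix locally. How those matrices are combined across parties is exactly what the proposed protocol specifies, and it is not reproduced in this sketch.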
