Multi-View Low-Rank Analysis with Applications to Outlier Detection

Detecting outliers or anomalies is a fundamental problem in various machine learning and data mining applications. Conventional outlier detection algorithms are mainly designed for single-view data. Nowadays, data can be easily collected from multiple views, and many learning tasks such as clustering and classification have benefited from multi-view data. However, outlier detection from multi-view data is still a very challenging problem, as the data in multiple views usually have more complicated distributions and exhibit inconsistent behaviors. To address this problem, we propose a multi-view low-rank analysis (MLRA) framework for outlier detection in this article. MLRA pursues outliers from a new perspective: robust data representation. It contains two major components. First, cross-view low-rank coding is performed to reveal the intrinsic structures of the data. In particular, we formulate a regularized rank-minimization problem, which is solved by an efficient optimization algorithm. Second, the outliers are identified through an outlier score estimation procedure. Unlike existing multi-view outlier detection methods, MLRA is able to detect two different types of outliers from multiple views simultaneously. To this end, we design a criterion to estimate the outlier scores by analyzing the obtained representation coefficients. Moreover, we extend MLRA to tackle the multi-view group outlier detection problem. Extensive evaluations on seven UCI datasets and on the MovieLens, USPS-MNIST, and WebKB datasets demonstrate that our approach outperforms several state-of-the-art outlier detection methods.
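The abstract does not spell out the optimization algorithm, so the following is only a generic illustration of a building block that commonly appears when a rank-minimization problem is relaxed to nuclear-norm minimization: singular value thresholding, the proximal operator of the nuclear norm. It is a minimal sketch, not the MLRA solver; the function name `singular_value_threshold`, the threshold `tau`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def singular_value_threshold(M, tau):
    """Proximal operator of tau * (nuclear norm): shrink each singular value by tau.

    Hypothetical helper for illustration; not taken from the paper.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the singular values
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunken spectrum


# Synthetic data (an assumption): a rank-2 matrix plus small dense noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = low_rank + 0.01 * rng.standard_normal((20, 15))

# Thresholding removes the small singular values contributed by the noise.
denoised = singular_value_threshold(noisy, tau=0.5)
rank_after = np.linalg.matrix_rank(denoised, tol=1e-8)
```

In iterative low-rank solvers, a step of this kind is typically alternated with updates of the other variables; how MLRA couples the views and derives outlier scores from the representation coefficients is described in the article itself and is not reproduced here.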
