Learning anomalous features via sparse coding using matrix norms

Our goal is to find anomalous features in a dataset using dictionary learning, a core technique of sparse coding. Rather than using the averaged column ℓ2-norm for the dictionary update, as is typically done in sparse coding, we explore three matrix norms: ∥·∥1, ∥·∥2, and ∥·∥∞. Minimizing a matrix norm minimizes a maximum deviation in the reconstruction error rather than an average deviation, which we expect to help recover features that contribute significantly but infrequently to the training samples. We find that, although solving for the dictionaries by matrix norm minimization takes longer to compute, all three methods recover a known basis from a simple set of training data. In addition, the ∥·∥1 matrix norm recovers a known anomalous feature in the training data that the other norms (including the standard averaged ℓ2-norm) are unable to find.
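To illustrate the contrast between a maximum and an average deviation, the following is a minimal sketch and not the authors' implementation. It assumes that the three matrix norms are the induced (operator) norms, so that ∥·∥1 is the maximum absolute column sum, ∥·∥2 is the largest singular value, and ∥·∥∞ is the maximum absolute row sum, and that the averaged column ℓ2-norm is the mean of the column ℓ2-norms. The residual matrix `R` and the anomalous column index are hypothetical, with columns standing in for the reconstruction errors of individual training samples.

```python
import numpy as np

def averaged_column_l2(R):
    # Mean of the l2-norms of the columns (assumed form of the standard objective).
    return np.mean(np.linalg.norm(R, axis=0))

def induced_norms(R):
    # Induced matrix norms as computed by numpy.linalg.norm for 2-D input.
    return {
        "1": np.linalg.norm(R, 1),         # max absolute column sum
        "2": np.linalg.norm(R, 2),         # largest singular value
        "inf": np.linalg.norm(R, np.inf),  # max absolute row sum
    }

rng = np.random.default_rng(0)

# Hypothetical residual: 8 features x 200 samples of small reconstruction error.
R = 0.01 * rng.standard_normal((8, 200))

# Same residual, but one sample (column 17) is reconstructed poorly.
R_anom = R.copy()
R_anom[:, 17] += 1.0

base, anom = induced_norms(R), induced_norms(R_anom)
avg_base, avg_anom = averaged_column_l2(R), averaged_column_l2(R_anom)

print("averaged column l2 ratio:", avg_anom / avg_base)
for name in ("1", "2", "inf"):
    print(f"induced {name}-norm ratio:", anom[name] / base[name])
```

Under these assumptions, a single badly reconstructed sample changes the averaged objective only modestly because its error is diluted across all columns, whereas the induced ∥·∥1 norm is determined entirely by the worst column.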