A Core Set Based Large Vector-Angular Region and Margin Approach for Novelty Detection

A large vector-angular region and margin (LARM) approach is presented for novelty detection based on imbalanced data. The key idea is to construct the largest vector-angular region in the feature space that separates the normal training patterns while maximizing the vector-angular margin between the surface of this optimal region and the abnormal training patterns. To improve the generalization performance of LARM, the vector-angular distribution is optimized by maximizing the vector-angular mean and minimizing the vector-angular variance, which separates the normal and abnormal examples well. However, a quadratic programming (QP) solver inherently takes O(n^3) training time and at least O(n^2) space for n training patterns, which can be computationally prohibitive for large-scale problems. Using approximation algorithms, a core set based LARM algorithm is proposed for fast training of LARM. Experimental results on imbalanced datasets validate the efficiency of the proposed approach in novelty detection.
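The abstract does not spell out the core set procedure, so the following is only a minimal sketch of the general idea behind core set approximation, not the LARM algorithm itself. It uses the simple Badoiu-Clarkson iteration for an approximate minimum enclosing ball in plain Euclidean space: the center is repeatedly nudged toward the current farthest point, and the farthest points visited form a small core set. The function name `approx_meb`, the parameter `eps`, and the use of input space rather than a kernel feature space are all assumptions made for illustration.

```python
import math

import numpy as np


def approx_meb(points, eps=0.1):
    """Approximate minimum enclosing ball via the Badoiu-Clarkson iteration.

    Hypothetical helper for illustration only; it is not the LARM solver.
    Returns (center, radius, core_indices), where the ball encloses every
    input point and core_indices are the farthest points visited.
    """
    pts = np.asarray(points, dtype=float)
    center = pts[0].copy()          # start from an arbitrary point
    core_idx = {0}
    n_iter = math.ceil(1.0 / eps ** 2)
    for i in range(1, n_iter + 1):
        # Find the point farthest from the current center.
        dists = np.linalg.norm(pts - center, axis=1)
        far = int(np.argmax(dists))
        core_idx.add(far)
        # Move the center toward it with a shrinking step 1 / (i + 1).
        center += (pts[far] - center) / (i + 1)
    # Use the largest remaining distance so the ball covers all points.
    radius = float(np.linalg.norm(pts - center, axis=1).max())
    return center, radius, sorted(core_idx)
```

The number of iterations depends only on `eps`, not on the number of points, which is why core set methods of this kind scale to large training sets: each pass is a single scan for the farthest point rather than a full QP solve.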
