An improved one-class support vector machine classifier for outlier detection

Outlier detection, a type of one-class classification problem, is an important research topic in data mining and machine learning. Its task is to identify sample points that deviate markedly from the normal data. A reliable outlier detector must build a model that encloses the normal data tightly. In this paper, an improved one-class SVM (OC-SVM) classifier is proposed for outlier detection. We name this method OC-SVM with minimum within-class scatter (OC-WCSSVM); it exploits the inner-class structure of the training set by minimizing the within-class scatter of the training data. This yields a more accurate hyperplane for outlier detection: the margin between the training data and the origin in a higher-dimensional space is as large as possible, while the decision boundary around the normal data is as tight as possible. Experimental results on a synthetic dataset and 10 real-world datasets show that the proposed OC-WCSSVM algorithm is effective and outperforms the algorithms it is compared against.
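The abstract names the two ingredients of the method but does not state the optimization problem. As a rough illustration only, the sketch below computes the usual within-class scatter matrix S_w = sum_i (x_i - m)(x_i - m)^T and the projected scatter w^T S_w w along a candidate hyperplane normal w, together with an OC-SVM-style decision rule sign(w·x - rho). It works in input space rather than a kernel feature space, does not solve the OC-WCSSVM optimization, and all function names are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(points: Sequence[Vector]) -> List[float]:
    """Component-wise mean m of the training points."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[j] for p in points) / n for j in range(dim)]


def within_class_scatter(points: Sequence[Vector]) -> List[List[float]]:
    """Scatter matrix S_w = sum_i (x_i - m)(x_i - m)^T (assumed definition)."""
    m = mean_vector(points)
    dim = len(m)
    s_w = [[0.0] * dim for _ in range(dim)]
    for p in points:
        d = [p[j] - m[j] for j in range(dim)]
        for a in range(dim):
            for b in range(dim):
                s_w[a][b] += d[a] * d[b]
    return s_w


def scatter_along(w: Vector, s_w: Sequence[Vector]) -> float:
    """Projected scatter w^T S_w w; smaller means a tighter fit along w."""
    dim = len(w)
    return sum(w[a] * s_w[a][b] * w[b] for a in range(dim) for b in range(dim))


def decision(w: Vector, rho: float, x: Vector) -> int:
    """OC-SVM-style rule: +1 (normal) if w.x - rho >= 0, else -1 (outlier)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - rho
    return 1 if score >= 0 else -1


# Toy data: three points on the line y = 1, so they vary only along x.
train = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
s_w = within_class_scatter(train)
scatter_x = scatter_along((1.0, 0.0), s_w)  # spread along the x direction
scatter_y = scatter_along((0.0, 1.0), s_w)  # spread along the y direction
```

On this toy set the direction (0, 1) has zero projected scatter, so a hyperplane with that normal both separates the data from the origin and hugs it tightly, which is the trade-off the abstract describes.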
