The Mean and Median Criteria for Kernel Bandwidth Selection for Support Vector Data Description

Support vector data description (SVDD) is a popular technique for anomaly detection. The SVDD classifier partitions the whole space into an inlier region, which lies near the training data, and an outlier region, which consists of points away from the training data. Computing the SVDD classifier requires a kernel function, and the Gaussian kernel is a common choice. The Gaussian kernel has a bandwidth parameter, whose value must be chosen well to obtain good results. A bandwidth that is too small leads to overfitting, and the resulting SVDD classifier overestimates the number of anomalies. A bandwidth that is too large leads to underfitting, and the classifier fails to detect many anomalies. In this paper, we present a new automatic, unsupervised method for selecting the Gaussian kernel bandwidth. The selected value can be computed quickly, and it is competitive with existing bandwidth selection methods.
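The effect of the bandwidth can be illustrated with a minimal sketch. It assumes the common parametrization K(x, y) = exp(-||x - y||^2 / (2 s^2)) with bandwidth s, which the abstract does not state explicitly, and it does not implement the selection method proposed in the paper. The helper names and the synthetic data are hypothetical and serve only to show how the average similarity between distinct training points changes with s.

```python
import math
import random


def gaussian_kernel(x, y, s):
    """Gaussian kernel with bandwidth s (assumed form: exp(-||x - y||^2 / (2 s^2)))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * s ** 2))


def mean_off_diagonal_similarity(points, s):
    """Average kernel value over all pairs of distinct points."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += gaussian_kernel(points[i], points[j], s)
    return total / (n * (n - 1) / 2)


# Synthetic 2-D training data (illustrative only, not from the paper).
random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]

for s in (0.01, 1.0, 100.0):
    value = mean_off_diagonal_similarity(points, s)
    print(f"s = {s}: mean similarity between distinct points = {value:.4f}")
```

With a very small s, distinct points have almost no similarity to one another, so each training point is effectively described in isolation, which corresponds to the overfitting regime. With a very large s, all pairs look nearly identical to the kernel, which corresponds to the underfitting regime. A useful bandwidth lies between these extremes.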
