A One-Class Kernel Fisher Criterion for Outlier Detection

Recently, Dufrenois and Noyer proposed a one-class Fisher linear discriminant to isolate normal data from outliers. In this paper, a kernelized version of their criterion is presented. Although the criterion was originally optimized by an iterative process alternating between subspace selection and clustering, I show here that it admits an upper bound that makes these two problems independent. In particular, the estimation of the label vector is formulated as an unconstrained binary linear problem (UBLP), which can be solved by an iterative perturbation method. Once the label vector is estimated, an optimal projection subspace is obtained by solving a generalized eigenvalue problem. Like many other kernel methods, the performance of the proposed approach depends on the choice of the kernel. When the criterion is built with a Gaussian kernel, I show that the proposed contrast measure is an efficient indicator for selecting an optimal kernel width. This property simplifies the model-selection problem, which is typically solved by costly (generalized) cross-validation procedures. Initialization, convergence analysis, and computational complexity are also discussed. Lastly, the proposed algorithm is compared with recent novelty detectors on synthetic and real data sets.
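
The abstract names two decoupled computational steps: once the label vector is fixed, a generalized eigenvalue problem yields the projection, and the resulting contrast can be scanned over Gaussian kernel widths to select the bandwidth. The exact scatter matrices of the one-class criterion are not reproduced in the abstract, so the Python sketch below uses a generic kernel-Fisher construction (between- and within-class scatter built from the kernel matrix) as a stand-in, not the authors' formulation; the function names, the ridge term reg, and the toy data are illustrative assumptions, and the UBLP label-estimation step is not sketched (y is assumed known).

```python
# Hypothetical sketch of the two decoupled steps described in the abstract:
# (1) a generalized eigenvalue problem for the projection given fixed labels,
# (2) scanning the resulting contrast over Gaussian kernel widths.
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def gaussian_kernel(X, sigma):
    """Gaussian kernel matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    D2 = cdist(X, X, metric="sqeuclidean")
    return np.exp(-D2 / (2.0 * sigma**2))

def kernel_fisher_contrast(K, y, reg=1e-6):
    """Contrast (generalized Rayleigh quotient) of a kernel Fisher-type
    criterion for a fixed binary label vector y (1 = inlier, 0 = outlier).
    Returns the leading generalized eigenvalue and the expansion
    coefficients of the corresponding projection direction."""
    n = len(y)
    idx1, idx0 = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    m1 = K[:, idx1].mean(axis=1)   # kernel mean map of the inlier class
    m0 = K[:, idx0].mean(axis=1)   # kernel mean map of the outlier class
    d = (m1 - m0)[:, None]
    B = d @ d.T                    # between-class scatter in feature space
    W = np.zeros((n, n))           # within-class scatter, ridge-regularized
    for idx in (idx1, idx0):
        Kc = K[:, idx] - K[:, idx].mean(axis=1, keepdims=True)
        W += Kc @ Kc.T / len(idx)
    W += reg * np.eye(n)
    vals, vecs = eigh(B, W)        # generalized eigenproblem B a = lambda W a
    return vals[-1], vecs[:, -1]

# Toy usage: pick the kernel width maximizing the contrast for given labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (80, 2)),       # normal data
               rng.uniform(-6, 6, (20, 2))])    # outliers
y = np.r_[np.ones(80, dtype=int), np.zeros(20, dtype=int)]

sigmas = np.logspace(-1, 1, 20)
contrasts = [kernel_fisher_contrast(gaussian_kernel(X, s), y)[0] for s in sigmas]
best = sigmas[int(np.argmax(contrasts))]
print(f"selected kernel width sigma = {best:.3f}")
```

On such toy data the contrast typically peaks at an intermediate width: a very small sigma makes every point look isolated, while a very large sigma washes out the inlier/outlier separation, which matches the intuition behind using the contrast as a width-selection indicator.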

[1] Jean-Charles Noyer, et al. Formulating Robust Linear Regression Estimation as a One-Class LDA Criterion: Discriminative Hat Matrix, 2013, IEEE Transactions on Neural Networks and Learning Systems.

[2] Jean-Charles Noyer, et al. A kernel hat matrix based rejection criterion for outlier removal in support vector regression, 2009, International Joint Conference on Neural Networks.

[3] Hans-Peter Kriegel, et al. Integrating structured biological data by Kernel Maximum Mean Discrepancy, 2006, ISMB.

[4] Koby Crammer, et al. A needle in a haystack: local one-class optimization, 2004, ICML.

[5] Chandan Srivastava, et al. Support Vector Data Description, 2011.

[6] Volker Roth, et al. Kernel Fisher Discriminants for Outlier Detection, 2006, Neural Computation.

[7] R. A. Mollineda, et al. The class imbalance problem in pattern classification and learning, 2009.

[8] M. Kawanabe, et al. Direct importance estimation for covariate shift adaptation, 2008.

[9] Michael I. Jordan, et al. Robust Novelty Detection with Single-Class MPM, 2002, NIPS.

[10] David M. J. Tax, et al. One-class classification, 2001.

[11] Bernhard Schölkopf, et al. Estimating the Support of a High-Dimensional Distribution, 2001, Neural Computation.

[12] Ivor W. Tsang, et al. Learning to Locate Relative Outliers, 2011, ACML.

[13] Takeo Kanade, et al. Discriminative cluster analysis, 2006, ICML.

[14] Sarangapani Jagannathan, et al. An Online Outlier Identification and Removal Scheme for Improving Fault Detection Performance, 2014, IEEE Transactions on Neural Networks and Learning Systems.

[15] Jing Gao, et al. Semi-supervised outlier detection, 2006, SAC '06.

[16] Huanhuan Chen, et al. Learning in the Model Space for Cognitive Fault Diagnosis, 2014, IEEE Transactions on Neural Networks and Learning Systems.

[17] Jieping Ye, et al. Discriminative K-means for Clustering, 2007, NIPS.

[18] Robert P. W. Duin, et al. Uniform Object Generation for Optimizing One-class Classifiers, 2002, J. Mach. Learn. Res..

[19] Alexander J. Smola, et al. Learning with kernels, 1998.

[20] Takafumi Kanamori, et al. Statistical outlier detection using direct density ratio estimation, 2011, Knowledge and Information Systems.