ACTIVE ONE-CLASS LEARNING BY KERNEL DENSITY ESTIMATION

Active learning has been a popular area of research in recent years. It improves the performance of learning tasks by querying the user for the labels of selected unlabeled samples; the goal is to achieve the highest possible gain in accuracy while posing as few queries as possible. Existing active learning approaches mostly target traditional binary or multi-class classification problems. In many real-world situations, however, we have access only to samples of a single class. Such problems are known as one-class learning or outlier detection problems; user relevance feedback in image retrieval systems is one example. In this paper, we propose an active learning method that uses only samples of one class. We use kernel density estimation as the baseline one-class learning algorithm and introduce confidence criteria for selecting the best sample to be labeled by the user. Experimental results on real-world and artificial datasets show that the proposed method significantly increases the average gain in accuracy compared to the popular strategy of selecting unlabeled samples at random.
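
As a rough illustration of this setup, the sketch below fits a Gaussian kernel density estimate to the positive samples and queries the unlabeled point whose estimated density lies closest to an acceptance threshold. The threshold rule and the uncertainty heuristic are illustrative assumptions for the sketch, not the paper's exact confidence criteria.

```python
# Minimal sketch of KDE-based active one-class learning (illustrative only).
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Labeled positives (one class only) and a pool of unlabeled samples.
X_pos = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
X_pool = rng.uniform(low=-4.0, high=4.0, size=(200, 2))

def fit_kde(X, bandwidth=0.5):
    """Fit a Gaussian KDE to the positive samples."""
    return KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(X)

def query_index(kde, X_pool, threshold):
    """Pick the unlabeled sample whose log-density is closest to the
    acceptance threshold, i.e. the sample the model is least sure about
    (an assumed confidence criterion, not the paper's)."""
    log_dens = kde.score_samples(X_pool)
    return int(np.argmin(np.abs(log_dens - threshold)))

# Threshold chosen so that ~95% of the positive samples are accepted
# (an assumed heuristic for the outlier boundary).
kde = fit_kde(X_pos)
threshold = np.quantile(kde.score_samples(X_pos), 0.05)

for _ in range(5):  # a few active-learning rounds
    idx = query_index(kde, X_pool, threshold)
    # In a real system the user would label X_pool[idx]; here we
    # simulate a positive answer if the point lies near the origin.
    is_positive = np.linalg.norm(X_pool[idx]) < 2.0
    if is_positive:
        X_pos = np.vstack([X_pos, X_pool[idx]])
        kde = fit_kde(X_pos)
        threshold = np.quantile(kde.score_samples(X_pos), 0.05)
    X_pool = np.delete(X_pool, idx, axis=0)
```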
