Active Learning from Positive and Unlabeled Data

In recent years, active learning has become a popular paradigm for using user feedback to improve the accuracy of learning algorithms. Active learning works by selecting the most informative sample among the unlabeled data and querying the user for the label of that point. Many different methods, such as uncertainty sampling and minimum risk sampling, have been used to select the most informative sample. Although many active learning algorithms have been proposed so far, most of them target binary or multi-class classification and therefore cannot be applied to problems in which only samples from one class and a set of unlabeled data are available. Such problems arise in many real-world situations and are known as learning from positive and unlabeled data. In this paper we propose an active learning algorithm that works in this setting. Our method separately estimates the probability densities of the positive and the unlabeled points and then computes the expected value of informativeness, which removes a hyper-parameter and yields a better measure of informativeness. Experiments and empirical analysis show promising results compared to other similar methods.
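The idea of scoring unlabeled points from two separately estimated densities can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the informativeness formula or name the hyper-parameter that is removed. The sketch assumes Gaussian kernel density estimates, the standard positive-unlabeled mixture assumption (so that the posterior is the class prior times the ratio of positive to unlabeled density), an uncertainty-style informativeness, and an average over a grid of candidate class priors as the "expected value". The function names `gaussian_kde`, `expected_informativeness`, and `select_query` are hypothetical.

```python
import numpy as np

def gaussian_kde(train, query, bandwidth):
    """Gaussian kernel density estimate of `train`, evaluated at `query`.

    Both inputs are 2-D arrays of shape (n_points, n_features).
    """
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    d = train.shape[1]
    # Pairwise squared distances, shape (n_query, n_train).
    sq_dist = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1) / norm

def expected_informativeness(positives, unlabeled, bandwidth=0.5, priors=None):
    """Score each unlabeled point; higher means more informative.

    ASSUMPTION (not from the paper): the averaged-out hyper-parameter is
    the positive class prior, and informativeness is posterior uncertainty.
    """
    if priors is None:
        priors = np.linspace(0.05, 0.95, 19)  # illustrative grid of priors
    p_pos = gaussian_kde(positives, unlabeled, bandwidth)
    p_unl = gaussian_kde(unlabeled, unlabeled, bandwidth)
    ratio = p_pos / np.maximum(p_unl, 1e-12)
    # P(y=1 | x) = prior * p_pos(x) / p_unl(x), clipped to [0, 1].
    posterior = np.clip(priors[None, :] * ratio[:, None], 0.0, 1.0)
    # Uncertainty is 1 at posterior 0.5 and 0 at posterior 0 or 1.
    uncertainty = 1.0 - np.abs(2.0 * posterior - 1.0)
    # Average over the prior grid instead of fixing a single prior.
    return uncertainty.mean(axis=1)

def select_query(positives, unlabeled, **kwargs):
    """Return the index of the highest-scoring unlabeled point and all scores."""
    scores = expected_informativeness(positives, unlabeled, **kwargs)
    return int(np.argmax(scores)), scores
```

Calling `select_query(positives, unlabeled)` with two arrays of shape `(n, d)` returns the index of the point to send to the user and the full score vector. The bandwidth is still a free parameter of this sketch and would need to be chosen, for example by cross-validation.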
