Balanced Active Learning Method for Image Classification

The manual labeling of natural images is, and has always been, a painstaking and slow process, especially when large data sets are involved. Many recent studies address this problem, and most of them use active learning, which reduces the number of images that need to be labeled. Active learning procedures usually select a subset of the whole data set by iteratively querying the unlabeled instances based on their predicted informativeness. One way of estimating the information content of an image is to use uncertainty sampling as the query strategy. This basic technique can significantly reduce the number of labels needed, e.g., to train a good classification model. Our goal was to improve this method by balancing the class distribution of the already labeled images. This modification is based on a novel metric that we present in this paper. We conducted experiments on two popular data sets to demonstrate the efficiency of our proposed balanced active learning (BAL) approach, and the results showed that it outperforms basic uncertainty sampling.
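To make the query step concrete, the following is a minimal sketch of plain uncertainty sampling, assuming the classifier outputs a matrix of per-class probabilities for the unlabeled pool. It scores each image by the Shannon entropy of its predicted distribution, which is one common uncertainty measure and not necessarily the one used in this work. The function names, the example probabilities, and the batch size are illustrative assumptions, and the balancing metric proposed in the paper is not reproduced here.

```python
import numpy as np


def entropy_uncertainty(probs):
    """Return the Shannon entropy of each row of class probabilities.

    Assumption: `probs` has shape (n_samples, n_classes) and each row sums to 1.
    """
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def select_queries(probs, batch_size):
    """Return indices of the `batch_size` most uncertain samples, most uncertain first."""
    scores = entropy_uncertainty(probs)
    return np.argsort(scores)[::-1][:batch_size]


# Hypothetical predicted probabilities for four unlabeled images, three classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # almost uniform, highest entropy
])

picked = select_queries(probs, batch_size=2)
print(picked)  # indices of the images that would be sent to the annotator
```

In an active learning loop, the selected images would be labeled, moved to the training set, and the classifier retrained before the next query round. Because this selection looks only at uncertainty, nothing prevents the labeled set from becoming dominated by a few classes, which is the imbalance the BAL approach aims to counteract.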
