Active Learning of Very-High Resolution Optical Imagery with SVM: Entropy vs Margin Sampling

An active learning method is proposed for the semi-automatic selection of training sets in remote sensing image classification. At each iteration, the method adds to the current training set the unlabeled pixels for which the predictions of an ensemble of classifiers, each trained on a bagged version of the training set, show maximum entropy. In this way, the algorithm selects the pixels about which the ensemble is most uncertain and which are therefore expected to improve the model once added to the training set. The user is asked to label these pixels at each iteration. Experiments using support vector machines (SVM) on an 8-class QuickBird image show that the method matches the accuracy of both a model trained with ten times more pixels and a model whose training set was built with a state-of-the-art SVM-specific active learning method.
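The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the committee members (for example, SVMs trained on bagged training sets) have already produced a class label for every candidate pixel, and it scores each pixel by the Shannon entropy of those votes. The function names `vote_entropy` and `select_queries`, and the example votes, are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in nats) of the labels a committee predicted for one pixel."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_queries(committee_votes, n_queries):
    """Return the indices of the n_queries candidates with the highest vote entropy.

    committee_votes[i] holds the labels assigned to candidate i by each committee member.
    """
    scores = [vote_entropy(v) for v in committee_votes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_queries]


# Hypothetical votes from a 4-member committee on 3 unlabeled pixels.
votes = [
    ["water", "water", "water", "water"],  # full agreement, entropy 0
    ["road", "roof", "road", "roof"],      # two-way split
    ["tree", "grass", "road", "roof"],     # every member disagrees
]
print(select_queries(votes, 2))  # indices of the two most uncertain pixels
```

In a full loop, the selected pixels would be labeled by the user, appended to the training set, and the committee retrained before the next round of scoring.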
