Enabling learning from large datasets: applying active learning to mobile robotics

Autonomous navigation in outdoor, off-road environments requires solving complex classification problems. Obstacle detection, road following, and terrain classification are examples of tasks that have been successfully approached using supervised machine learning techniques for classification. Large amounts of training data are usually necessary to achieve satisfactory generalization. In such cases, manually labeling data becomes an expensive and tedious process. This work describes a method for reducing the amount of data that needs to be presented to a human trainer. The algorithm relies on kernel density estimation to identify "interesting" scenes in a dataset. Our method does not require any interaction with a human expert for selecting the images, and only minimal tuning is necessary. We demonstrate its effectiveness in several experiments using data collected with two different vehicles. First, we show that our method automatically selects those scenes from a large dataset that a person would consider "important" for classification tasks. Second, we show that by labeling only a few of the images selected by our method, we obtain classification performance comparable to that reached after labeling hundreds of images from the same dataset.
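
The abstract does not say how the density estimate is turned into a selection rule, so the following is only a minimal sketch under stated assumptions: each image is summarized by a fixed-length feature vector, the kernel is an isotropic Gaussian with a single hand-set bandwidth, and "interesting" scenes are taken to be those lying in low-density regions of feature space. The function names `gaussian_kde_scores` and `select_interesting` and the `bandwidth` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kde_scores(features, bandwidth):
    """Leave-one-out Gaussian kernel density estimate at every row of `features`."""
    x = np.asarray(features, dtype=float)
    n, d = x.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    # Exclude each point's own contribution so isolated points score near zero.
    np.fill_diagonal(kernel, 0.0)
    norm = (n - 1) * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return kernel.sum(axis=1) / norm


def select_interesting(features, k, bandwidth=1.0):
    """Return indices of the k samples with the lowest estimated density.

    Assumption: low density stands in for "interesting"; the paper's
    actual criterion may differ.
    """
    density = gaussian_kde_scores(features, bandwidth)
    return np.argsort(density)[:k]


# Synthetic stand-in for image features: many similar scenes plus three rare ones.
rng = np.random.default_rng(0)
common = rng.normal(0.0, 0.5, size=(200, 2))
rare = np.array([[8.0, 8.0], [-9.0, 7.0], [10.0, -10.0]])
features = np.vstack([common, rare])

picked = select_interesting(features, k=3)
```

In this toy setup the three rare rows (indices 200 to 202) are the ones selected for labeling. The pairwise computation is quadratic in the number of samples, so a large dataset would need a faster density estimator than this sketch.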
