Active Detection via Adaptive Submodularity

Efficient detection of multiple object instances is one of the fundamental challenges in computer vision. For certain object categories, even the best automatic systems are yet unable to produce high-quality detection results, and fully manual annotation would be an expensive process. How can detection algorithms interplay with human expert annotators? To make the best use of scarce (human) labeling resources, one needs to decide when to invoke the expert, such that the best possible performance can be achieved while requiring a minimum amount of supervision. In this paper, we propose a principled approach to active object detection, and show that for a rich class of base detectors algorithms, one can derive a natural sequential decision problem for deciding when to invoke expert supervision. We further show that the objective function satisfies adaptive submodularity, which allows us to derive strong performance guarantees for our algorithm. We demonstrate the proposed algorithm on three real-world tasks, including a problem for biodiversity monitoring from micro UAVs in the Sumatra rain forest. Our results show that active detection not only outperforms its passive counterpart; for certain tasks, it also works significantly better than straightforward application of existing active learning techniques. To the best of our knowledge, our approach is the first to rigorously address the active detection problem from both empirical and theoretical perspectives.

[1]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[2]  Pietro Perona,et al.  Strong supervision from weak annotation: Interactive training of deformable part models , 2011, 2011 International Conference on Computer Vision.

[3]  S. Wich,et al.  Dawn of Drone Ecology: Low-Cost Autonomous Aerial Vehicles for Conservation , 2012 .

[4]  Yoav Freund,et al.  Active learning for visual object recognition , 2005 .

[5]  Dana H. Ballard,et al.  Generalizing the Hough transform to detect arbitrary shapes , 1981, Pattern Recognit..

[6]  Kristen Grauman,et al.  Multi-Level Active Prediction of Useful Image Annotations for Recognition , 2008, NIPS.

[7]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Nikolaos Papanikolopoulos,et al.  Scalable Active Learning for Multiclass Image Classification , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Juergen Gall,et al.  Class-specific Hough forests for object detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[11]  Pushmeet Kohli,et al.  On Detection of Multiple Object Instances Using Hough Transforms , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Kristen Grauman,et al.  Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds , 2011, CVPR 2011.

[13]  Devi Parikh,et al.  Attributes for Classifier Feedback , 2012, ECCV.

[14]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Pietro Perona,et al.  Multiclass recognition and part localization with humans in the loop , 2011, 2011 International Conference on Computer Vision.

[16]  David A. McAllester,et al.  Object Detection with Grammar Models , 2011, NIPS.

[17]  Lawrence O. Hall,et al.  Active learning to recognize multiple types of plankton , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[18]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[19]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[20]  Abhinav Gupta,et al.  Constrained Semi-Supervised Learning Using Attributes and Comparative Attributes , 2012, ECCV.

[21]  Adriana Kovashka,et al.  Actively selecting annotations among objects and attributes , 2011, 2011 International Conference on Computer Vision.

[22]  Trevor Darrell,et al.  Active Learning with Gaussian Processes for Object Categorization , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[23]  Pietro Perona,et al.  Visual Recognition with Humans in the Loop , 2010, ECCV.

[24]  P.V.C. Hough,et al.  Machine Analysis of Bubble Chamber Pictures , 1959 .

[25]  Andreas Krause,et al.  Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization , 2010, J. Artif. Intell. Res..

[26]  Prateek Jain,et al.  Far-sighted active learning on a budget for image and video recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.