Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

Many popular object detectors, such as AdaBoost, SVM, and deformable part-based models (DPM), compute additive scoring functions at a large number of windows in an image pyramid, so computational efficiency is, alongside accuracy, an important consideration in real-time applications. In this paper, a decision policy refers to a sequence of two-sided thresholds that trigger early rejection or early acceptance based on the cumulative score at each step. We formulate an empirical risk function as the weighted sum of the computational cost and the losses from false alarms and missed detections. A policy is said to be cost-sensitive and optimal if it minimizes this risk function. Although the risk function is complex because of high-order correlations among the two-sided thresholds, we find that its upper bound can be optimized efficiently by dynamic programming. We show empirically that the upper bound is very tight, and the resulting policy is therefore said to be near-optimal. In experiments on several popular detection tasks and benchmarks, the decision policy significantly outperforms state-of-the-art cascade methods in computational efficiency while achieving similar detection accuracy.
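
The two ingredients named in the abstract, a two-sided threshold policy applied to cumulative scores and an empirical risk that mixes computation cost with error losses, can be illustrated with a minimal Python sketch. It is not the authors' learning algorithm: it only evaluates a policy that is already given and does not perform the dynamic-programming optimization of the upper bound. The function names (`run_policy`, `empirical_risk`), the weights `lam`, `c_fp`, and `c_fn`, the per-stage costs, and the final-stage threshold `final_thr` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def run_policy(
    stage_scores: Sequence[float],
    reject_thr: Sequence[float],
    accept_thr: Sequence[float],
    final_thr: float = 0.0,
) -> Tuple[bool, int]:
    """Apply a two-sided threshold policy to one window.

    reject_thr and accept_thr cover every stage except the last one
    (assumed layout). Returns (decision, number_of_stages_evaluated).
    """
    total = 0.0
    pairs = zip(stage_scores, reject_thr, accept_thr)
    for step, (score, low, high) in enumerate(pairs, start=1):
        total += score  # cumulative additive score
        if total < low:
            return False, step  # early reject
        if total >= high:
            return True, step  # early accept
    # No early decision: add the remaining stage scores and use the final threshold.
    total += sum(stage_scores[len(reject_thr):])
    return total >= final_thr, len(stage_scores)


def empirical_risk(
    samples: List[Sequence[float]],
    labels: Sequence[bool],
    reject_thr: Sequence[float],
    accept_thr: Sequence[float],
    stage_cost: Sequence[float],
    lam: float = 0.1,   # weight on computation cost (assumed)
    c_fp: float = 1.0,  # loss of a false alarm (assumed)
    c_fn: float = 1.0,  # loss of a missed detection (assumed)
) -> float:
    """Average of weighted computation cost plus false-alarm and miss losses."""
    risk = 0.0
    for scores, is_object in zip(samples, labels):
        decision, used = run_policy(scores, reject_thr, accept_thr)
        risk += lam * sum(stage_cost[:used])
        if decision and not is_object:
            risk += c_fp
        elif is_object and not decision:
            risk += c_fn
    return risk / len(samples)


# Toy data: three windows, three scoring stages each.
samples = [[1.0, 1.0, 1.0], [-2.0, 0.5, 0.5], [0.2, -0.1, 0.3]]
labels = [True, False, False]
reject_thr = [-1.0, -0.5]
accept_thr = [0.9, 1.5]
stage_cost = [1.0, 2.0, 4.0]

decisions = [run_policy(s, reject_thr, accept_thr) for s in samples]
risk = empirical_risk(samples, labels, reject_thr, accept_thr, stage_cost)
print(decisions, risk)
```

In this toy run the first window is accepted and the second rejected after a single stage, while the third needs all three stages and ends as a false alarm, so it contributes most of the risk. Tightening or loosening the thresholds trades computation against errors, which is the trade-off the risk function is meant to capture.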
