Adaptive cascade threshold learning from negative samples for deformable part models

A common way to deploy object detection systems in practical applications is to build cascade frameworks, which perform a threshold comparison at each stage to efficiently discard a large number of negative candidates. For a particular application, these thresholds should be retrained on training datasets to improve both effectiveness and efficiency. This means that we must either store labeled datasets permanently or collect a large amount of data (for high-quality thresholds) whenever new thresholds are learned. Both approaches are inconvenient and expensive in terms of memory and data collection cost. In this paper, we propose a novel threshold selection mechanism, named Adaptive Cascade Threshold Learning (ACTL), which learns thresholds directly from non-object regions in a single input image, rather than from object regions in large training datasets as existing methods do. As a result, the need for training data in cascade threshold learning is removed completely. Experimental results on object detection and face detection show that the proposed method achieves the same level of accuracy and speed as state-of-the-art cascade DPM methods, with the added benefit of requiring no threshold training data.
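To make the stage-wise threshold comparison concrete, the following is a minimal sketch of generic cascade rejection, not the ACTL procedure itself. It assumes each candidate window has one partial score per stage and that a candidate is discarded as soon as its accumulated score falls below that stage's threshold. The function name `cascade_score` and the example values are hypothetical, chosen only for illustration.

```python
from typing import Optional, Sequence


def cascade_score(stage_scores: Sequence[float],
                  thresholds: Sequence[float]) -> Optional[float]:
    """Return the accumulated score if the candidate survives every stage.

    Returns None as soon as the running score drops below a stage
    threshold, so later stages are never evaluated for that candidate.
    (Hypothetical helper: illustrates generic cascade pruning only.)
    """
    if len(stage_scores) != len(thresholds):
        raise ValueError("one threshold is required per stage")

    total = 0.0
    for score, threshold in zip(stage_scores, thresholds):
        total += score            # accumulate this stage's partial score
        if total < threshold:     # threshold comparison at this stage
            return None           # early rejection of a negative candidate
    return total


# A candidate that passes all three stages keeps its full score.
kept = cascade_score([1.0, 0.5, 0.25], [0.5, 1.0, 1.5])

# A candidate that fails the first stage is rejected immediately.
rejected = cascade_score([0.2, 5.0], [0.5, 1.0])
```

In this sketch, higher thresholds discard more candidates early (faster, but with a risk of missing true objects), which is why the choice of thresholds affects both accuracy and speed.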
