Personness estimation for real-time human detection on mobile devices

A fast and accurate detection proposal method for the person category is proposed. Detection proposals are used by the part-based human detector in an improved way. The high effectiveness of the proposed method is demonstrated on a real mobile device.

One aim of detection proposal methods is to reduce the computational overhead of object detection. However, most existing methods are themselves too computationally expensive for real-time detection on mobile devices. We propose a fast and accurate proposal method for human detection, called personness estimation, which facilitates real-time human detection on mobile devices and can be effectively integrated into part-based detection, achieving high detection performance at a low computational cost. Our work is based on two observations: (i) normed gradients, which were designed for generic objectness estimation, effectively generate high-quality detection proposals for the person category; (ii) fusing the normed gradients with color attributes improves proposal generation for human detection. Thus, the candidate windows generated by personness estimation are very likely to contain human subjects. Human detection is then guided by these candidate windows, offering high detection performance even when the detection task terminates prior to completion. This interruptible detection scheme, called anytime detection, enables real-time human detection on mobile devices. Furthermore, we introduce a new evaluation methodology, called time-recall curves, to evaluate our approach in practical terms. The applicability of the proposed method is demonstrated in extensive experiments on a publicly available dataset and on a real mobile device, facilitating the acquisition and enhancement of portrait photographs (e.g. selfies) on widespread mobile platforms.
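To make the ideas of anytime detection and time-recall curves concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that proposals arrive as (score, box) pairs, that the detector is an arbitrary callable standing in for a part-based detector, and that time is counted in abstract cost units per evaluated window rather than wall-clock time. The names `anytime_detect`, `time_recall_curve`, `cost_per_window` and `iou_thresh` are hypothetical, and the recall definition (fraction of ground-truth boxes matched at IoU of at least 0.5 within a given budget) is an assumption about how such a curve could be computed.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), hypothetical box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def anytime_detect(
    proposals: Sequence[Tuple[float, Box]],
    detector: Callable[[Box], bool],
    budget: float,
    cost_per_window: float = 1.0,  # assumed fixed cost per evaluated window
) -> List[Box]:
    """Evaluate proposals in descending score order until the budget runs out.

    Because the most person-like windows are visited first, stopping early
    still returns the detections found so far.
    """
    detections: List[Box] = []
    spent = 0.0
    for _score, box in sorted(proposals, key=lambda p: p[0], reverse=True):
        if spent + cost_per_window > budget:
            break  # interrupted: return what has been found so far
        spent += cost_per_window
        if detector(box):
            detections.append(box)
    return detections


def time_recall_curve(
    proposals: Sequence[Tuple[float, Box]],
    detector: Callable[[Box], bool],
    ground_truth: Sequence[Box],
    budgets: Sequence[float],
    iou_thresh: float = 0.5,  # assumed matching threshold
) -> List[Tuple[float, float]]:
    """Return (budget, recall) pairs, one per time budget."""
    curve: List[Tuple[float, float]] = []
    for budget in budgets:
        dets = anytime_detect(proposals, detector, budget)
        hits = sum(
            1 for gt in ground_truth
            if any(iou(gt, d) >= iou_thresh for d in dets)
        )
        recall = hits / len(ground_truth) if ground_truth else 0.0
        curve.append((budget, recall))
    return curve


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    props = [(0.9, (0, 0, 10, 10)), (0.2, (50, 50, 60, 60)), (0.8, (20, 20, 30, 30))]
    # Toy detector: accepts a window if it overlaps any ground-truth box.
    toy = lambda box: any(iou(box, g) >= 0.5 for g in gt)
    print(time_recall_curve(props, toy, gt, budgets=[1, 2, 3]))
    # prints [(1, 0.5), (2, 1.0), (3, 1.0)]
```

In this toy run, recall reaches 1.0 after two windows because the two highest-scoring proposals cover both people. This is the behaviour that well-ranked proposals are meant to enable when detection is interrupted early.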
