Sparse auto-encoder based feature learning for human body detection in depth image

Human body detection in depth images is an active research topic in computer vision, but depth feature extraction remains an open problem. In this paper, a novel feature learning method based on the sparse auto-encoder (SAE) is proposed for human body detection in depth images. The learned feature captures the intrinsic structure of the human body. To reduce the computational cost of the SAE, a convolutional neural network and pooling are introduced to lower the training complexity. In addition, having learned the SAE-based depth feature, we further pursue detector efficiency. A beyond-sliding-window localization strategy is proposed, based on the fact that the depth values across an object's surface are almost the same. The strategy first uses the histogram of depth to generate candidate detection window centers, and then exploits the relationship between human body height and depth to determine the detection window size. It thus avoids the time-consuming sliding window search and enables fast human body localization. Experiments on the SZU Depth Pedestrian dataset verify the effectiveness of the proposed method.

Highlights: Sparse auto-encoder is used to learn depth features for human detection. A beyond-sliding-window localization method based on depth values is proposed.
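The localization strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how histogram peaks are turned into window centers or how height relates to depth, so the sketch assumes that each locally dominant depth bin yields one candidate, that the center is the centroid of the pixels in that bin, and that the window height follows a pinhole camera model, h = f * H / d. The function name `candidate_windows` and all parameter values (`bin_width`, `min_pixels`, `focal_px`, `body_height_m`, `aspect`) are hypothetical placeholders.

```python
import numpy as np

def candidate_windows(depth, bin_width=0.25, min_pixels=200,
                      focal_px=575.0, body_height_m=1.8, aspect=0.5):
    """Propose (cx, cy, w, h) windows from a depth map in metres (0 = no reading).

    All defaults are illustrative assumptions, not values from the paper.
    """
    valid = depth > 0
    if not valid.any():
        return []

    # Histogram of depth: a surface at roughly constant depth shows up as a peak.
    edges = np.arange(0.0, depth[valid].max() + 2 * bin_width, bin_width)
    counts, _ = np.histogram(depth[valid], bins=edges)

    windows = []
    for i, n in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < len(counts) else 0
        # Keep only well-populated bins that are local maxima of the histogram.
        if n < min_pixels or n < left or n <= right:
            continue

        # Assumed center rule: centroid of the pixels that fall in this bin.
        mask = valid & (depth >= edges[i]) & (depth < edges[i + 1])
        ys, xs = np.nonzero(mask)
        d = float(depth[mask].mean())

        # Assumed size rule (pinhole model): pixel height shrinks as 1 / depth.
        h = focal_px * body_height_m / d
        windows.append((float(xs.mean()), float(ys.mean()), aspect * h, h))
    return windows


if __name__ == "__main__":
    demo = np.zeros((240, 320))
    demo[60:200, 140:180] = 2.1  # a synthetic "person" 2.1 m from the sensor
    print(candidate_windows(demo))
```

Under these assumptions the number of windows passed to the classifier equals the number of histogram peaks rather than the number of sliding-window positions. Background surfaces such as walls also produce peaks, so each candidate would still have to be scored with the learned SAE feature.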
