Human Detection with Occlusion Handling by Over-Segmentation and Clustering on Foreground Regions

Two-dimensional image-based human detection methods are widely used in surveillance systems. However, detecting humans in the presence of occlusion remains a challenge for such image-based systems. This paper proposes a human detection method that handles occlusion by using depth data obtained from 3D imaging, such as the data readily acquired with the Microsoft Kinect depth sensor. In a surveillance setting, background subtraction on the depth data can be used to extract foreground regions that may correspond to humans. The proposed method analyzes the 3D data of the foreground regions using a "split-merge" approach: over-segmentation and clustering are performed on the foreground regions, followed by height validation. Experimental results demonstrate that the proposed method outperforms two state-of-the-art human detection methods.
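The pipeline described above (depth background subtraction, a split step, a merge step, then height validation) can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the over-segmentation or clustering algorithms, so fixed grid patches and a greedy one-dimensional grouping on depth stand in for them here. All function names, thresholds, the focal length `fy`, and the synthetic scene are assumptions made for illustration.

```python
import numpy as np

def foreground_mask(depth, background, min_diff=0.15):
    """Depth background subtraction: keep pixels at least min_diff metres
    closer than the background model. A depth of 0 is treated as invalid."""
    valid = (depth > 0) & (background > 0)
    return valid & (background - depth > min_diff)

def split_merge(depth, mask, cell=4, max_gap=0.3):
    """Split: cut the foreground into cell x cell patches (a stand-in for
    over-segmentation). Merge: sort patches by mean depth and start a new
    cluster whenever the depth jump exceeds max_gap (a stand-in for the
    clustering step)."""
    h, w = depth.shape
    patches = []  # (mean depth, top row, left column)
    for r in range(0, h, cell):
        for c in range(0, w, cell):
            m = mask[r:r + cell, c:c + cell]
            if m.any():
                z = float(depth[r:r + cell, c:c + cell][m].mean())
                patches.append((z, r, c))
    patches.sort()
    clusters = []
    for p in patches:
        if clusters and p[0] - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters

def plausible_height(cluster, cell=4, fy=40.0, min_h=1.0, max_h=2.2):
    """Height validation under a pinhole model:
    metric height ~ pixel height * depth / focal length (fy is assumed)."""
    rows = [r for _, r, _ in cluster]
    pixel_h = max(rows) + cell - min(rows)
    mean_z = sum(z for z, _, _ in cluster) / len(cluster)
    return min_h <= pixel_h * mean_z / fy <= max_h

# Synthetic 40 x 40 scene: wall at 4 m, one person partly occluding another,
# plus a small object that should fail height validation.
background = np.full((40, 40), 4.0)
depth = background.copy()
depth[20:24, 28:32] = 3.5   # small object
depth[8:32, 12:24] = 3.0    # person B, farther away
depth[4:36, 8:16] = 2.0     # person A, in front, overlaps B in the image

mask = foreground_mask(depth, background)
clusters = split_merge(depth, mask)
people = [c for c in clusters if plausible_height(c)]
print(len(clusters), "clusters,", len(people), "pass height validation")
```

In this scene the two overlapping people form one connected foreground blob in the image, yet they separate cleanly along the depth axis, which is the intuition behind using depth data for occlusion handling. Grouping on depth alone would merge people standing side by side at the same distance, so a practical system would need to cluster on full 3D position.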
