Human detection by searching in 3D space using camera and scene knowledge

Many existing human detection systems are based on sub-window classification; that is, detection is done by enumerating rectangular sub-images in the 2D image space. The detection rate of such approaches can be degraded by perspective distortion and by the tilted orientation of humans in the image. To overcome this problem without retraining the classifier, we develop a 3D search method. A search grid is defined in the 3D scene. At each grid point, a rectified sub-image is generated to approximate the orthogonal projection of the target, so that the distortion due to the camera setting is reduced. In addition, the 3D target position can be estimated from single-camera data. Experiments on challenging data from the PETS2007 and CAVIAR INRIA datasets show significantly improved detection performance for our approach compared with 2D search-based methods.
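The 3D search idea can be illustrated with a minimal sketch. It assumes a calibrated pinhole camera given as a 3x4 projection matrix, a flat ground plane at Z = 0, and a nominal person height of 1.7 m; none of these specifics come from the abstract. All names (`project`, `grid_candidates`, the demo camera) are hypothetical. For each grid point the sketch projects a foot point and a head point into the image, which gives the apparent height and in-image tilt that a rectification step would then need to undo. The rectification itself, and the orthogonal-projection approximation the abstract describes, are not reproduced here.

```python
import numpy as np


def project(P, X):
    """Project 3D points of shape (N, 3) to pixels (N, 2) with a 3x4 camera matrix."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]


def grid_candidates(P, xs, ys, height=1.7):
    """Enumerate ground-plane grid points and describe each candidate in the image.

    Returns the 3D foot positions, the foot and head pixels, the apparent
    height in pixels, and the tilt of the foot-to-head axis measured from
    image "up" (radians, positive toward increasing u).
    """
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    feet3d = np.stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)], axis=1)
    heads3d = feet3d + np.array([0.0, 0.0, height])  # assumed person height

    feet = project(P, feet3d)
    heads = project(P, heads3d)
    axis = heads - feet
    pixel_height = np.linalg.norm(axis, axis=1)
    tilt = np.arctan2(axis[:, 0], -axis[:, 1])  # image v grows downward
    return feet3d, feet, heads, pixel_height, tilt


# Hypothetical camera: 2 m above the ground, looking along world +Y, no pitch.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
C = np.array([0.0, 0.0, 2.0])          # camera centre in world coordinates
t = -R @ C
P_demo = K @ np.hstack([R, t[:, None]])

feet3d, feet, heads, pixel_height, tilt = grid_candidates(
    P_demo, xs=[0.0], ys=[10.0, 20.0]
)
for pos, h, a in zip(feet3d, pixel_height, tilt):
    print(pos, round(float(h), 1), round(float(a), 3))
```

Because every candidate comes from a known ground-plane position, a detection at a grid point directly yields a 3D location estimate from a single camera, which is the property the abstract highlights. With a pitched or rolled camera the `tilt` values become nonzero away from the image centre, and a rectifying warp would rotate and scale each sub-image before it is passed to the unchanged 2D classifier.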
