SURF cascade face detection acceleration on Sandy Bridge processor

With GPU cores now integrated on the same die as the CPU, the performance of Intel's processor graphics has improved significantly over earlier generations of integrated graphics. This paper presents a highly optimized SURF cascade based face detector that exploits both CPU and GPU computing power on the latest Sandy Bridge processor. The SURF cascade classification procedure is partitioned into two phases in order to leverage both thread-level and data-level parallelism in the GPU. The integral image computation runs on the CPU cores in parallel with the GPU. We measure the performance and power consumption of the GPU implementation on the latest Sandy Bridge platform. The experimental results show that the proposed GPU implementation achieves a 2.98x speedup over the single-threaded CPU implementation and a 1.42x speedup over the multi-threaded CPU implementation. At the same time, power usage is reduced by as much as 50% compared to the CPU implementation. In addition, the proposed method offers a general approach to task partitioning between CPU and GPU, and is therefore beneficial not only for face detection but also for other computer vision applications.
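
The integral image step that the abstract assigns to the CPU can be illustrated with a minimal sketch. This is a generic summed-area table in plain Python, not the paper's optimized implementation; the function names `integral_image` and `box_sum` and the zero-padded table layout are assumptions chosen for illustration.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            # Table entry = entry directly above + sum of this row so far.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the half-open rectangle [x0, x1) x [y0, y1)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


if __name__ == "__main__":
    image = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    table = integral_image(image)
    # Sum over the bottom-right 2x2 block: 5 + 6 + 8 + 9
    print(box_sum(table, 1, 1, 3, 3))
```

Once the table is built, any rectangular sum costs four lookups regardless of the rectangle's size, which is why detectors of this kind compute it once per frame before evaluating many windows.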
