Real-time GPU-based face detection in HD video sequences

Modern GPUs have evolved into fully programmable parallel stream multiprocessors. Because graphics workloads and computer vision algorithms are similar in nature, the latter are in a good position to leverage the computing power of these devices. Face detection is one problem that benefits greatly from this parallelism. This paper presents a highly optimized Haar-based face detector that works in real time on high-definition video. The proposed kernel operations exploit both coarse-grained and fine-grained parallelism to compute integral images and evaluate filters, which makes them useful not only for face detection but also for other computer vision techniques. In experiments comparing against previous implementations, our proposal achieves a sustained throughput of 35 fps at 1080p resolution using a sliding window with a step of one pixel.
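As an illustration of the two building blocks named above, the following is a minimal sequential sketch in Python, not the paper's GPU kernels. It builds an integral image as a row-wise prefix sum followed by a column-wise prefix sum, the decomposition whose independent rows and columns make the computation amenable to parallel scans, and then evaluates one two-rectangle Haar-like feature with constant-time rectangle sums. The function names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical and chosen only for this example.

```python
from itertools import accumulate


def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero top row and left column."""
    h = len(img)
    w = len(img[0]) if h else 0
    # Pass 1: inclusive prefix sum along each row (rows are independent of each other).
    rows = [list(accumulate(row)) for row in img]
    # Pass 2: prefix sum down each column of the row-scanned data
    # (columns are independent of each other).
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = ii[y][x + 1] + rows[y][x]
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), width w, height h."""
    # Four table lookups, regardless of the rectangle size.
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


if __name__ == "__main__":
    image = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
    ]
    table = integral_image(image)
    print(rect_sum(table, 1, 1, 2, 2))          # sum of 6, 7, 10, 11
    print(two_rect_feature(table, 0, 0, 4, 3))  # left two columns minus right two columns
```

Because every rectangle sum costs four lookups, the cost of evaluating a feature does not depend on the window size, which is what makes a dense sliding window with a one-pixel step feasible once the integral image is available.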
