Human head detection based on multi-stage CNN with voting strategy

As a classical problem in target detection, human head detection based on head features is an important basis for intelligent vehicle driving and people counting. Because the human head is irregular and complex in appearance, hand-crafted feature descriptors achieve lower recognition rates and poorer robustness. As an important part of deep learning, the convolutional neural network (CNN) has been applied successfully to image recognition and speech analysis. To address the instability of head features and the difficulty of describing them, this paper proposes a new head detection method based on a multi-stage CNN with a voting strategy. First, the features extracted from different layers of the multi-stage CNN are each used to perform a separate classification. Then, the per-layer results are combined through a voting strategy to obtain the final classification. Experimental results show that the new method achieves a higher recognition rate than traditional methods.
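The voting step described above can be illustrated with a minimal sketch. It assumes simple majority voting over binary labels (1 = head, 0 = non-head) produced by one classifier per CNN stage. The abstract does not specify the voting rule, the number of stages, or how ties are handled, so the function name `majority_vote`, the tie-break toward the smaller label, and the example labels are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(stage_predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-stage class labels by majority vote, sample by sample.

    stage_predictions[s][i] is the label that the classifier attached to
    CNN stage s assigned to sample i (assumed here: 1 = head, 0 = non-head).
    """
    if not stage_predictions:
        raise ValueError("at least one stage is required")
    n_samples = len(stage_predictions[0])
    if any(len(p) != n_samples for p in stage_predictions):
        raise ValueError("all stages must label the same number of samples")

    final = []
    # zip(*...) regroups the labels so each tuple holds one sample's votes.
    for votes in zip(*stage_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Assumed tie-break: pick the smallest label among the tied ones,
        # which keeps the result deterministic.
        final.append(min(label for label, c in counts.items() if c == top))
    return final


# Hypothetical labels from three stage-level classifiers for three windows.
stage_labels = [
    [1, 0, 1],  # classifier on stage 1 features
    [1, 1, 0],  # classifier on stage 2 features
    [0, 0, 1],  # classifier on stage 3 features
]
decisions = majority_vote(stage_labels)
```

With an odd number of stages and binary labels, ties cannot occur; the tie-break only matters if an even number of stages is used.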
