Vision-based vehicle detecting and counting for traffic flow analysis

In this paper, we present a system that detects and counts vehicles in traffic surveillance videos using the Fast Region-based Convolutional Network (Fast R-CNN). Fast R-CNN is a state-of-the-art object detection network that takes an entire image and a set of object proposals as input and produces bounding-box positions with probability estimates over object classes as output. First, we fine-tune a pre-trained Fast R-CNN network on images captured from traffic videos to improve accuracy. Second, we define a series of rules for screening bounding boxes so that vehicles can be counted. The proposed system takes around 3 seconds per image to count vehicles on a GTX970 GPU, and then records the resulting vehicle count in a database for traffic flow analysis. Experimental results demonstrate that the proposed system provides significant improvements in detection accuracy. In addition, experiments on challenging videos with occlusions or dense traffic show that the proposed system works effectively.
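The abstract does not spell out the screening rules, so the following is only a minimal sketch of what such a counting step could look like, not the authors' method. It assumes three hypothetical rules (a minimum class score, a minimum box area, and greedy overlap suppression), and the names `count_vehicles`, `score_thresh`, `min_area`, and `iou_thresh` together with their default values are illustrative assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def count_vehicles(
    detections: List[Tuple[Box, float]],
    score_thresh: float = 0.5,   # assumed value, not from the paper
    min_area: float = 100.0,     # assumed value, not from the paper
    iou_thresh: float = 0.3,     # assumed value, not from the paper
) -> int:
    """Count vehicles from (box, score) detections for the vehicle class.

    Hypothetical screening rules:
      1. drop boxes whose score is below score_thresh,
      2. drop boxes smaller than min_area,
      3. keep the highest-scoring box among heavily overlapping ones.
    """
    candidates = [
        (box, score)
        for box, score in detections
        if score >= score_thresh and area(box) >= min_area
    ]
    # Visit boxes from most to least confident.
    candidates.sort(key=lambda d: d[1], reverse=True)

    kept: List[Box] = []
    for box, _ in candidates:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return len(kept)
```

In a pipeline like the one described, the returned count for each frame would then be written to the database; the thresholds would need tuning on the target traffic footage.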
