A Semi-Automatic 2D Solution for Vehicle Speed Estimation from Monocular Videos

In this work, we present a novel approach for vehicle speed estimation from monocular videos. The pipeline consists of modules for multi-object detection, robust tracking, and speed estimation. The tracking algorithm jointly tracks individual vehicles and estimates their velocities in the image domain. However, since camera parameters are often unavailable and the scenes vary widely, transforming image-domain measurements to the real world is challenging. We propose a simple two-stage algorithm to approximate the transformation: images are first rectified to restore affine properties, then the scaling factor is compensated for each scene. We show the effectiveness of the proposed method with extensive experiments on the traffic speed analysis dataset of the NVIDIA AI City Challenge. We achieve a detection rate of 1.0 in vehicle detection and tracking, and a root mean square error (RMSE) of 9.54 mph for vehicle speed estimation in unconstrained traffic videos.
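The two-stage transformation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the rectification is available as a 3x3 homography `H` and that the per-scene scaling factor is a single number `meters_per_unit`. The function names and parameters are hypothetical, chosen only for illustration.

```python
import math

MPS_TO_MPH = 3600.0 / 1609.344  # metres per second to miles per hour


def apply_homography(H, point):
    """Map an image point through a 3x3 homography (stage 1: rectification)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def speed_mph(H, p_prev, p_curr, dt_seconds, meters_per_unit):
    """Estimate speed from two tracked image positions of one vehicle.

    H               -- assumed rectifying homography for the scene
    meters_per_unit -- assumed per-scene scale factor (stage 2)
    """
    x0, y0 = apply_homography(H, p_prev)
    x1, y1 = apply_homography(H, p_curr)
    distance_m = math.hypot(x1 - x0, y1 - y0) * meters_per_unit
    return distance_m / dt_seconds * MPS_TO_MPH


def rmse(predicted, ground_truth):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - g) ** 2 for p, g in zip(predicted, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))
```

With an identity homography and a scale of one metre per unit, a displacement of 5 units in one second corresponds to 5 m/s, or roughly 11.18 mph. In practice `H` and `meters_per_unit` would have to be estimated for each scene, which is the part the abstract describes as semi-automatic.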
