Camera-based vehicle velocity estimation from monocular video

This paper documents the winning entry at the CVPR2017 vehicle velocity estimation challenge. Velocity estimation is an emerging task in autonomous driving which has not yet been thoroughly explored. The goal is to estimate the relative velocity of a specific vehicle from a sequence of images. In this paper, we present a light-weight approach for directly regressing vehicle velocities from their trajectories using a multilayer perceptron. Another contribution is an explorative study of features for monocular vehicle velocity estimation. We find that light-weight trajectory based features outperform depth and motion cues extracted from deep ConvNets, especially for far-distance predictions where current disparity and optical flow estimators are challenged significantly. Our light-weight approach is real-time capable on a single CPU and outperforms all competing entries in the velocity estimation challenge. On the test set, we report an average error of 1.12 m/s which is comparable to a (ground-truth) system that combines LiDAR and radar techniques to achieve an error of around 0.71 m/s.

[1]  Konrad Schindler,et al.  View-Consistent 3D Scene Flow Estimation over Multiple Frames , 2014, ECCV.

[2]  Jian Sun,et al.  Instance-Aware Semantic Segmentation via Multi-task Network Cascades , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Oisin Mac Aodha,et al.  Unsupervised Monocular Depth Estimation with Left-Right Consistency , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Charles E. Thorpe,et al.  Perception for collision avoidance and autonomous driving , 2003 .

[5]  Andreas Geiger,et al.  Object scene flow for autonomous vehicles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[7]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[8]  Honglak Lee,et al.  Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units , 2016, ICML.

[9]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[10]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[11]  Tom Schaul,et al.  Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.

[12]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[13]  Jörg Stückler,et al.  Semi-Supervised Deep Learning for Monocular Depth Map Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Berthold K. P. Horn,et al.  Determining Optical Flow , 1981, Other Conferences.

[15]  Jun-Wei Hsieh,et al.  Automatic traffic surveillance system for vehicle tracking and classification , 2006, IEEE Transactions on Intelligent Transportation Systems.

[16]  Daniel Cremers,et al.  Structure- and motion-adaptive regularization for high accuracy optic flow , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[17]  Ming-Hsuan Yang,et al.  Hierarchical Convolutional Features for Visual Tracking , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[18]  Noah Snavely,et al.  Unsupervised Learning of Depth and Ego-Motion from Video , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Nicu Sebe,et al.  Multi-scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  David Beymer,et al.  A real-time computer vision system for vehicle tracking and traffic surveillance , 1998 .

[21]  Philipp Wintersberger,et al.  Effects of exhaust gases on laser scanner data quality at low ambient temperatures , 2017, 2017 IEEE Intelligent Vehicles Symposium (IV).

[22]  Ali Farhadi,et al.  Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks , 2016, ECCV.

[23]  Gustavo Carneiro,et al.  Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue , 2016, ECCV.

[24]  Ralph Helmar Rasshofer,et al.  Influences of weather phenomena on automotive laser radar systems , 2011 .

[25]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Ronald K. Jurgen,et al.  Adaptive Cruise Control , 2006 .

[27]  Yi Li,et al.  Fully Convolutional Instance-Aware Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Silvio Savarese,et al.  Learning to Track at 100 FPS with Deep Regression Networks , 2016, ECCV.

[29]  Yujie Wang,et al.  Flow-Guided Feature Aggregation for Video Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  Vladlen Koltun,et al.  Learning to Act by Predicting the Future , 2016, ICLR.

[31]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[32]  Konrad Schindler,et al.  Piecewise Rigid Scene Flow , 2013, 2013 IEEE International Conference on Computer Vision.

[33]  Jiri Matas,et al.  Forward-Backward Error: Automatic Detection of Tracking Failures , 2010, 2010 20th International Conference on Pattern Recognition.

[34]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Ming-Hsuan Yang,et al.  Visual tracking with online Multiple Instance Learning , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[37]  Ashutosh Saxena,et al.  Learning Depth from Single Monocular Images , 2005, NIPS.

[38]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).