Modeling Traffic Scenes for Intelligent Vehicles Using CNN-Based Detection and Orientation Estimation

Identifying objects in images taken from moving vehicles remains a complex task in computer vision because the scenes are highly dynamic and the environment is poorly structured. This work proposes an efficient approach to recognition on images from a stereo camera, with the goal of gaining insight into traffic scenes in urban and road environments. We rely on a deep learning framework able to simultaneously identify a broad range of entities, such as vehicles, pedestrians, or cyclists, at a frame rate compatible with the strict requirements of onboard automotive applications. The results demonstrate the capabilities of the perception system in a wide variety of situations, thus providing valuable information for understanding the traffic scenario.
