Unsupervised obstacle detection in driving environments using deep-learning-based stereovision

Abstract: A vision-based obstacle detection system is a key enabler for the development of autonomous robots and vehicles and of intelligent transportation systems. This paper addresses the problem of monitoring urban scenes and tracking obstacles using unsupervised deep-learning approaches. We design a hybrid encoder that integrates deep Boltzmann machines (DBM) and auto-encoders (AE). This hybrid auto-encoder (HAE) model combines the greedy learning features of the DBM with the dimensionality reduction capacity of the AE to accurately and reliably detect the presence of obstacles. We combine the proposed hybrid model with a one-class support vector machine (OCSVM) to visually monitor an urban scene. We also propose an efficient approach to estimating obstacle locations and tracking their positions via scene densities. Specifically, we treat obstacle detection as an anomaly detection problem: if the OCSVM algorithm detects an obstacle, the localization and tracking algorithm is then executed. We validated the effectiveness of our approach using experimental data from two publicly available datasets, the Malaga stereovision urban dataset (MSVUD) and the Daimler urban segmentation dataset (DUSD). Results show the capacity of the proposed approach to reliably detect obstacles.
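The detect-then-localize idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: PCA stands in for the HAE encoder, the per-frame feature vectors are synthetic rather than taken from MSVUD or DUSD, and the helper names (`make_frames`, `fit_detector`, `has_obstacle`) and all parameter values are assumptions chosen for illustration. The sketch assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM


def make_frames(n, latent_mean, rng, mixing):
    """Synthetic stand-in for per-frame feature vectors (not real stereo data)."""
    latent = rng.normal(loc=latent_mean, scale=1.0, size=(n, mixing.shape[0]))
    noise = rng.normal(scale=0.1, size=(n, mixing.shape[1]))
    return latent @ mixing + noise


def fit_detector(free_frames, n_components=4, nu=0.05):
    """Fit encoder and one-class SVM on obstacle-free frames only."""
    # Assumption: PCA replaces the paper's hybrid auto-encoder (HAE).
    encoder = PCA(n_components=n_components).fit(free_frames)
    detector = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
    detector.fit(encoder.transform(free_frames))
    return encoder, detector


def has_obstacle(encoder, detector, frames):
    """Boolean mask: True where the OCSVM labels a frame as an anomaly (-1)."""
    return detector.predict(encoder.transform(frames)) == -1


rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 32))                    # hypothetical feature mixing
free_train = make_frames(300, 0.0, rng, mixing)      # obstacle-free scenes
encoder, detector = fit_detector(free_train)

free_test = make_frames(100, 0.0, rng, mixing)
obstacle_test = make_frames(20, 8.0, rng, mixing)    # strongly shifted features

flags = has_obstacle(encoder, detector, np.vstack([free_test, obstacle_test]))
# Only flagged frames would be handed to a localization and tracking stage.
frames_to_localize = np.flatnonzero(flags)
print("flagged frames:", frames_to_localize.size, "of", flags.size)
```

Training on obstacle-free frames only is what makes the scheme unsupervised with respect to obstacles: the detector never sees a labeled obstacle, and `nu` bounds the fraction of training frames allowed to fall outside the learned region.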
