Driving me around the bend: Learning to drive from visual gist

This article proposes an approach to learning steering and road-following behaviour from a human driver using holistic visual features. We use a random forest (RF) to regress a mapping from these features to the driver's actions, and propose an alternative to classical random forest regression based on the medoid (RF-Medoid), which reduces the underestimation of extreme control values. We compare prediction performance across three holistic visual descriptors: GIST, Channel-GIST (C-GIST) and Pyramidal-HOG (P-HOG). The proposed methods are evaluated on two datasets: predicting human driving behaviour on countryside roads, and autonomous control of a robot on an indoor track. We show that 1) C-GIST leads to the best predictions on both sequences, and 2) RF-Medoid gives better estimates of extreme values, where a classical RF tends to under-steer. We use around 10% of the data for training and show excellent generalization over a dataset of thousands of images. Importantly, we do not hand-engineer the solution but instead use machine learning to automatically identify the relationship between visual features and behaviour, providing an efficient, generic solution to autonomous control.
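
The abstract does not spell out how RF-Medoid is computed, so the following is only a minimal sketch of the general idea, under the assumption that the medoid is taken over a set of candidate control values such as the individual trees' predictions. The function name `medoid` and the values in `tree_votes` are hypothetical and chosen for illustration; they are not data from the article. The sketch shows why averaging can under-steer: a few dissenting votes pull the mean away from an extreme value, whereas the medoid always returns one of the actual candidates.

```python
import numpy as np


def medoid(predictions):
    """Return the candidate with the smallest summed distance to all others.

    `predictions` has shape (n,) for scalar controls or (n, d) for
    d-dimensional controls. The result always has shape (d,).
    """
    p = np.asarray(predictions, dtype=float)
    if p.ndim == 1:
        p = p[:, None]  # treat scalars as 1-D vectors
    # Pairwise Euclidean distances between all candidates, shape (n, n).
    dists = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    best = np.argmin(dists.sum(axis=1))
    return p[best]


# Hypothetical per-tree steering votes for one frame in a sharp bend:
# most trees vote for a strong turn, two trees vote for almost none.
tree_votes = np.array([0.9, 0.85, 0.95, 0.9, 0.1, 0.0, 0.88])

mean_estimate = tree_votes.mean()          # classical RF aggregation
medoid_estimate = medoid(tree_votes)[0]    # assumed medoid aggregation

print(f"mean:   {mean_estimate:.3f}")
print(f"medoid: {medoid_estimate:.3f}")
```

In this toy case the mean is dragged towards zero by the two outlying votes, while the medoid stays with the majority that votes for a sharp turn. For scalar controls the medoid coincides with a median element; the pairwise-distance form is used here because it also extends to multi-dimensional control vectors.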
