How Much of Driving Is Preattentive?

Driving a car in an urban setting is an extremely difficult problem that incorporates a large number of complex visual tasks; yet most adults solve it daily with little apparent effort. This paper proposes a novel vision-based approach to autonomous driving that can predict, and even anticipate, a driver's behavior in real time, using preattentive vision only. Experiments on three large datasets totaling over 200,000 frames show that our preattentive model can (1) detect a wide range of driving-critical context, such as crossroads, city center, and road type; (2) more surprisingly, detect the driver's actions (over 80% of braking and turning actions); and (3) estimate the driver's steering angle accurately. Additionally, the model is consistent with human data. First, the best steering prediction is obtained for a perception-to-action delay consistent with psychological experiments; importantly, this prediction can be made before the driver acts. Second, the regions of the visual field used by the computational model correlate strongly with the driver's gaze locations, significantly outperforming many saliency measures and performing comparably to state-of-the-art approaches.

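The abstract describes the pipeline only at a high level. The minimal sketch below illustrates one plausible instantiation of such a preattentive predictor, assuming a GIST-like descriptor (Gabor filter energies pooled over a coarse spatial grid) regressed to the steering angle with a random forest. All parameter choices (filter frequencies, grid size, forest size), the function gist_descriptor, and the stand-in data are illustrative assumptions, not the authors' implementation.

# Minimal sketch (assumption, not the paper's code): GIST-like preattentive
# features from pooled Gabor filter energies, fed to a random-forest
# regressor for the steering angle. Frequencies, grid size, and forest size
# are illustrative; frames and steering labels below are random stand-ins.
import numpy as np
from scipy.ndimage import convolve
from skimage.filters import gabor_kernel
from sklearn.ensemble import RandomForestRegressor

def gist_descriptor(gray, frequencies=(0.3, 0.15, 0.075), n_orientations=8, grid=4):
    """Average Gabor energy over a grid x grid layout (holistic 'gist' features)."""
    feats = []
    for frequency in frequencies:
        for o in range(n_orientations):
            kernel = np.real(gabor_kernel(frequency, theta=np.pi * o / n_orientations))
            response = np.abs(convolve(gray, kernel, mode="reflect"))
            h, w = response.shape
            for i in range(grid):            # coarse spatial pooling
                for j in range(grid):
                    cell = response[i * h // grid:(i + 1) * h // grid,
                                    j * w // grid:(j + 1) * w // grid]
                    feats.append(cell.mean())
    return np.asarray(feats)

# Stand-in data: grayscale dashcam frames and the steering angle recorded a
# fixed perception-to-action delay after each frame (hypothetical values).
rng = np.random.default_rng(0)
frames = rng.random((50, 120, 160))
steering_deg = rng.uniform(-30.0, 30.0, 50)

X = np.stack([gist_descriptor(f) for f in frames])
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, steering_deg)
print(model.predict(X[:1]))    # steering predicted from preattentive features alone

Under the same assumptions, a random-forest classifier trained on identical descriptors could detect braking and turning actions, and varying the offset between frame and label would probe the perception-to-action delay discussed above.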