Attentional demand estimation with attentive driving models

Driving can require processing large amounts of visual information. Such situations can overload the perceptual systems of human drivers, leading to ‘inattentional blindness’, where potentially critical visual information is overlooked. This phenomenon of ‘looking but failing to see’ is the third-largest contributor to traffic accidents in the UK. In this work we develop a method to identify these particularly demanding driving scenes using an end-to-end driving architecture, equipped with a spatial attention mechanism and trained to mimic ground-truth driving controls from video input. At test time, the network’s attention distribution is segmented to identify relevant items in the driving scene, and these items are used to estimate the attentional demand on the driver according to an established model in cognitive neuroscience. Our approach collects no ground-truth attentional demand data and instead uses readily available odometry data in a novel way. It is shown to outperform several baselines on a new dataset of 1200 driving scenes labelled for attentional demand in driving.
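The test-time step described above (segmenting the attention distribution into items and deriving a demand estimate from them) can be illustrated with a minimal sketch. The abstract does not specify the segmentation procedure or how items map to demand, so everything here is an assumption made for illustration: the function name `count_attended_items`, the fixed threshold, the 4-connectivity rule, and the use of a raw item count as a stand-in for demand are all hypothetical.

```python
from collections import deque


def count_attended_items(attention, threshold=0.5):
    """Count 4-connected regions whose attention is at least `threshold`.

    `attention` is a 2D list of floats (a hypothetical attention map).
    The returned count is only an illustrative proxy for attentional demand.
    """
    rows, cols = len(attention), len(attention[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if attention[r][c] < threshold or seen[r][c]:
                continue
            # Found an unvisited attended cell: start a new region.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit the four direct neighbours (up, down, left, right).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and not seen[ny][nx]
                        and attention[ny][nx] >= threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Toy attention map (assumed values) with three separate attended regions.
attention_map = [
    [0.9, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
    [0.0, 0.6, 0.0, 0.7],
]
num_items = count_attended_items(attention_map)
```

In this toy setting, a scene with more separate attended regions would be treated as more demanding. A real system would operate on the trained network's attention output rather than hand-written values.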
