Predicting Driver Attention in Critical Situations

Robust driver attention prediction in critical situations is a challenging computer vision problem, yet one that is essential for autonomous driving. Because critical driving moments are rare, collecting enough data for such situations is difficult with the conventional in-car data collection protocol of tracking eye movements during driving. Here, we first propose a new in-lab protocol for collecting driver attention data and introduce the Berkeley DeepDrive Attention (BDD-A) dataset, built upon braking-event videos selected from a large-scale, crowd-sourced driving video dataset. We further propose a Human Weighted Sampling (HWS) method, which uses human gaze behavior to identify crucial frames of a driving dataset and weights them heavily during model training. With our dataset and HWS, we built a driver attention prediction model that outperforms the state of the art and demonstrates sophisticated behaviors, such as attending to crossing pedestrians while not raising false alarms for pedestrians walking safely on the sidewalk. To human observers, its predictions are nearly indistinguishable from ground truth. Although trained only on our in-lab attention data, the model also predicts in-car driver attention during routine driving with state-of-the-art accuracy. This result not only demonstrates the performance of our model but also attests to the validity and usefulness of our dataset and data collection protocol.
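The idea behind HWS can be sketched as weighted sampling of training frames. The sketch below is only illustrative: the criterion used here, scoring each frame by how far its gaze map deviates (in KL divergence) from the dataset's average gaze map, is an assumption standing in for the paper's actual crucial-frame measure, and the function names are hypothetical.

```python
import numpy as np

def frame_weights(gaze_maps, eps=1e-8):
    """Assign a sampling weight to each frame from its gaze map.

    Illustrative criterion (an assumption, not the paper's formula):
    frames whose gaze distribution deviates strongly from the dataset's
    average gaze map are treated as "crucial" and weighted up.
    """
    maps = np.asarray(gaze_maps, dtype=float)
    maps = maps / (maps.sum(axis=(1, 2), keepdims=True) + eps)  # normalize each map
    baseline = maps.mean(axis=0)                                # average gaze map
    baseline = baseline / (baseline.sum() + eps)
    # KL(frame || baseline): larger means a more atypical frame
    kl = (maps * np.log((maps + eps) / (baseline + eps))).sum(axis=(1, 2))
    return kl / (kl.sum() + eps)                                # sampling probabilities

def sample_batch(gaze_maps, batch_size, rng=None):
    """Draw a training batch with crucial frames over-represented."""
    rng = rng if rng is not None else np.random.default_rng(0)
    p = frame_weights(gaze_maps)
    return rng.choice(len(gaze_maps), size=batch_size, p=p)
```

During training, batches drawn this way expose the model to rare critical moments far more often than uniform sampling over a dataset dominated by routine driving.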
