Predicting Driver Attention in Critical Situations

Robust driver attention prediction in critical situations is a challenging computer vision problem, yet one that is essential for autonomous driving. Because critical driving moments are rare, collecting enough data for such situations is difficult with the conventional in-car data collection protocol of tracking eye movements during driving. Here, we first propose a new in-lab protocol for collecting driver attention data and introduce the Berkeley DeepDrive Attention (BDD-A) dataset, built upon braking-event videos selected from a large-scale, crowd-sourced driving video dataset. We further propose a Human Weighted Sampling (HWS) method, which uses human gaze behavior to identify crucial frames of a driving dataset and weights them heavily during model training. With our dataset and HWS, we built a driver attention prediction model that outperforms the state of the art and demonstrates sophisticated behaviors, such as attending to crossing pedestrians while not raising false alarms for pedestrians walking safely on the sidewalk. To human observers, its predictions are nearly indistinguishable from ground truth. Although trained only on our in-lab attention data, the model also predicts in-car driver attention during routine driving with state-of-the-art accuracy. This result not only demonstrates the performance of our model but also attests to the validity and usefulness of our dataset and data collection protocol.
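The idea behind HWS can be sketched as weighted sampling of training frames. The sketch below is only illustrative: the criterion used here, scoring each frame by how far its gaze map deviates (in KL divergence) from the dataset's average gaze map, is an assumption standing in for the paper's actual crucial-frame measure, and the function names are hypothetical.

```python
import numpy as np

def frame_weights(gaze_maps, eps=1e-8):
    """Assign a sampling weight to each frame from its gaze map.

    Illustrative criterion (an assumption, not the paper's formula):
    frames whose gaze distribution deviates strongly from the dataset's
    average gaze map are treated as "crucial" and weighted up.
    """
    maps = np.asarray(gaze_maps, dtype=float)
    maps = maps / (maps.sum(axis=(1, 2), keepdims=True) + eps)  # normalize each map
    baseline = maps.mean(axis=0)                                # average gaze map
    baseline = baseline / (baseline.sum() + eps)
    # KL(frame || baseline): larger means a more atypical frame
    kl = (maps * np.log((maps + eps) / (baseline + eps))).sum(axis=(1, 2))
    return kl / (kl.sum() + eps)                                # sampling probabilities

def sample_batch(gaze_maps, batch_size, rng=None):
    """Draw a training batch with crucial frames over-represented."""
    rng = rng if rng is not None else np.random.default_rng(0)
    p = frame_weights(gaze_maps)
    return rng.choice(len(gaze_maps), size=batch_size, p=p)
```

During training, batches drawn this way expose the model to rare critical moments far more often than uniform sampling over a dataset dominated by routine driving.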
