DR(eye)VE: A Dataset for Attention-Based Tasks with Applications to Autonomous and Assisted Driving

Autonomous and assisted driving are hot topics in computer vision. However, the driving task is extremely complex, and a deep understanding of drivers' behavior is still lacking. Several researchers are now investigating the attention mechanism in order to define computational models for detecting salient and interesting objects in the scene. Nevertheless, most of these models refer only to bottom-up visual saliency and focus on still images. During driving, by contrast, the temporal nature and the peculiarities of the task influence the attention mechanisms, which makes real-life driving data essential. In this paper we propose a novel, publicly available dataset acquired during actual driving. Our dataset, composed of more than 500,000 frames, contains drivers' gaze fixations and their temporal integration, providing task-specific saliency maps. Geo-referenced locations, driving speed and course complete the set of released data. To the best of our knowledge, this is the first publicly available dataset of this kind, and it can foster new discussions on better understanding, exploiting and reproducing the driver's attention process in the autonomous and assisted cars of future generations.
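
As a rough illustration of how gaze fixations might be integrated over time into a per-frame saliency map, the sketch below accumulates a Gaussian blob for every fixation that falls within a temporal window around a given frame. It is not the procedure used to build the dataset: the function name `fixation_saliency_map`, the `(frame, x, y)` fixation format, the window length and the Gaussian spread `sigma` are all assumptions chosen for illustration.

```python
import numpy as np


def fixation_saliency_map(fixations, frame_index, shape, window=12, sigma=20.0):
    """Integrate gaze fixations around `frame_index` into a saliency map.

    fixations: iterable of (frame, x, y) tuples in pixel coordinates
               (hypothetical format, not the dataset's own).
    shape:     (height, width) of the output map.
    window:    fixations further than this many frames away are ignored.
    sigma:     standard deviation, in pixels, of the blob drawn per fixation.
    """
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    saliency = np.zeros(shape, dtype=np.float64)

    for frame, x, y in fixations:
        if abs(frame - frame_index) > window:
            continue  # outside the temporal window
        # Add an isotropic Gaussian centred on the fixation point.
        saliency += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

    total = saliency.sum()
    # Normalise to a probability-like map; leave an empty map as zeros.
    return saliency / total if total > 0 else saliency


# Two fixations close in time and space, plus one far away in time.
fixations = [(10, 30, 20), (11, 32, 21), (100, 5, 5)]
saliency = fixation_saliency_map(fixations, frame_index=10, shape=(48, 64),
                                 window=5, sigma=3.0)
```

A wider window smooths the map over more of the driver's recent gaze, while a narrower one tracks individual fixations more closely; the right trade-off depends on the task.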
