Attention Monitoring and Hazard Assessment with Bio-Sensing and Vision: Empirical Analysis Utilizing CNNs on the KITTI Dataset

Assessing a driver's attention and detecting hazardous and non-hazardous events during a drive are critical for driver safety. Attention monitoring in driving scenarios has mostly been carried out with the vision (camera-based) modality, by tracking the driver's gaze and facial expressions. Only recently have bio-sensing modalities such as electroencephalography (EEG) been explored. A related open problem has not yet been explored sufficiently in this paradigm: the detection of specific events during driving, hazardous and non-hazardous, that affect the driver's mental and physiological states. A further challenge in evaluating multi-modal sensory applications is the absence of very large-scale EEG data, owing to the various limitations of using EEG in the real world. In this paper, we use both of the above sensor modalities and compare them on two tasks: assessing the driver's attention and detecting hazardous vs. non-hazardous driving events. We collect user data from twelve subjects and show how, in the absence of very large-scale datasets, pre-trained deep convolutional networks can still be used to extract meaningful features from both modalities. We use the publicly available KITTI dataset to evaluate our platform and to compare it with previous studies. Finally, we show that the results presented in this paper surpass the previous benchmarks for the above driver-awareness applications.
