In the Blink of an Eye: Event-based Emotion Recognition

We introduce a wearable single-eye emotion recognition device and a real-time approach for recognizing emotions from partial, single-eye observations that is robust to changes in lighting conditions. At the heart of our method are a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event-based cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher temporal resolution (on the order of μs vs. tens of ms). The captured events can therefore encode rich temporal cues under challenging lighting conditions. However, these events lack texture information, which makes it difficult to decode the temporal information effectively. SEEN tackles this issue from two perspectives. First, we adopt convolutional spiking layers to exploit the spiking neural network's ability to decode pertinent temporal information. Second, SEEN learns to extract essential spatial cues from corresponding intensity frames and leverages a novel weight-copy scheme to convey this spatial attention to the convolutional spiking layers during both training and inference. We extensively validate and demonstrate the effectiveness of our approach on a specially collected Single-eye Event-based Emotion (SEE) dataset. To the best of our knowledge, our method is the first eye-based emotion recognition method that leverages event-based cameras and spiking neural networks.
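To make the weight-copy scheme concrete, the following is a minimal PyTorch sketch, not the authors' implementation: a conventional convolution branch processes intensity frames, and its learned spatial filters are copied into a convolutional spiking layer that consumes the event stream. The module names, tensor shapes, and the simple leaky integrate-and-fire (LIF) dynamics below are illustrative assumptions.

```python
# Hypothetical sketch of the weight-copy idea; all names and LIF
# dynamics are assumptions, not the paper's actual SEEN code.
import torch
import torch.nn as nn

class SpikingConv(nn.Module):
    """Convolution followed by simple leaky integrate-and-fire dynamics."""
    def __init__(self, in_ch, out_ch, tau=0.25, v_th=1.0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.tau, self.v_th = tau, v_th

    def forward(self, x_seq):                      # x_seq: (T, B, C, H, W)
        v = torch.zeros_like(self.conv(x_seq[0]))  # membrane potential
        spikes = []
        for x_t in x_seq:                          # iterate over time steps
            v = self.tau * v + self.conv(x_t)      # leaky integration
            s = (v >= self.v_th).float()           # fire on threshold crossing
            v = v * (1.0 - s)                      # hard reset after a spike
            spikes.append(s)
        return torch.stack(spikes)                 # (T, B, out_ch, H, W)

frame_conv = nn.Conv2d(1, 16, 3, padding=1, bias=False)  # frame (spatial) branch
event_conv = SpikingConv(1, 16)                          # event (temporal) branch

# Weight copy: reuse the frame branch's spatial filters in the spiking branch,
# so spatial cues learned from intensity frames guide event decoding.
with torch.no_grad():
    event_conv.conv.weight.copy_(frame_conv.weight)

events = (torch.rand(8, 2, 1, 64, 64) > 0.95).float()  # T=8 sparse toy event frames
out_spikes = event_conv(events)                         # binary spike maps
```

In this reading, the spiking branch never learns its spatial filters directly; it inherits them from the frame branch at each forward pass, which is one plausible way to combine frame-derived spatial attention with event-derived temporal cues.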
