Deep learning analysis of mobile physiological, environmental and location sensor data for emotion detection

The detection and monitoring of emotions are important in various applications, e.g. to enable naturalistic and personalised human-robot interaction. Emotion detection often requires modelling data inputs from multiple modalities, including physiological signals (e.g. EEG and GSR), environmental data (e.g. audio and weather), videos (e.g. for capturing facial expressions and gestures) and, more recently, motion and location data. Many traditional machine learning algorithms have been utilised to capture the diversity of multimodal data at the sensor and feature levels for human emotion classification. While the feature engineering processes often embedded in these algorithms are beneficial for emotion modelling, they inherit some critical limitations which may hinder the development of reliable and accurate models. In this work, we adopt a deep learning approach for emotion classification through an iterative process of adding and removing a large number of sensor signals from different modalities. Our dataset was collected in a real-world study from smartphones and wearable devices. It merges the local interaction of three sensor modalities (on-body, environmental and location) into a global model that represents signal dynamics along with the temporal relationships of each modality. Our approach employs a series of learning algorithms, including a hybrid of a Convolutional Neural Network and a Long Short-Term Memory Recurrent Neural Network (CNN-LSTM) applied to the raw sensor data, eliminating the need for manual feature extraction and engineering. The results show that deep learning approaches are effective in human emotion classification when a large number of sensor inputs is utilised (average accuracy 95% and F-Measure=95%), and that the hybrid models outperform a traditional fully connected deep neural network (average accuracy 73% and F-Measure=73%).
Furthermore, the hybrid models outperform previously developed Ensemble algorithms that utilise feature engineering to train the model (average accuracy 83% and F-Measure=82%).
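To make the hybrid CNN-LSTM idea concrete, the following is a minimal sketch of a forward pass over one window of raw multichannel sensor data: a 1D convolution extracts local patterns across time, an LSTM summarises their temporal order, and a softmax layer produces class probabilities. It is an illustration only and not the architecture used in this work. The window length, channel count, filter count, hidden size and number of emotion classes are all hypothetical, and the weights are random and untrained.

```python
import math
import random

random.seed(0)

# All sizes below are hypothetical, chosen only for illustration.
T, C = 20, 6        # time steps per window, sensor channels
K, F = 3, 4         # convolution kernel length, number of filters
H, N_CLASSES = 5, 5  # LSTM hidden units, emotion classes


def rand_matrix(rows, cols, scale=0.5):
    """Matrix of small random weights (stand-in for trained parameters)."""
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def conv1d_relu(window, kernels, biases):
    """Valid 1D convolution over time followed by ReLU.

    window: T steps, each a list of C channel values.
    kernels: F filters, each a K x C matrix.
    Returns T-K+1 steps, each a list of F feature values.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(window) - k + 1):
        step = []
        for kern, bias in zip(kernels, biases):
            s = bias
            for i in range(k):
                for c, w in enumerate(kern[i]):
                    s += w * window[t + i][c]
            step.append(max(0.0, s))
        out.append(step)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_last_hidden(seq, W, U, b, hidden):
    """Run a single-layer LSTM over seq and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        pre = {}
        for g in "ifoc":  # input, forget, output gates and candidate cell
            pre[g] = [
                sum(W[g][j][m] * x[m] for m in range(len(x)))
                + sum(U[g][j][n] * h[n] for n in range(hidden))
                + b[g][j]
                for j in range(hidden)
            ]
        i_gate = [sigmoid(v) for v in pre["i"]]
        f_gate = [sigmoid(v) for v in pre["f"]]
        o_gate = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c = [f_gate[j] * c[j] + i_gate[j] * cand[j] for j in range(hidden)]
        h = [o_gate[j] * math.tanh(c[j]) for j in range(hidden)]
    return h


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


# One synthetic window of raw readings (random values, not real sensor data).
window = rand_matrix(T, C, scale=1.0)

kernels = [rand_matrix(K, C) for _ in range(F)]
biases = [0.0] * F
W = {g: rand_matrix(H, F) for g in "ifoc"}
U = {g: rand_matrix(H, H) for g in "ifoc"}
b = {g: [0.0] * H for g in "ifoc"}
W_out = rand_matrix(N_CLASSES, H)

features = conv1d_relu(window, kernels, biases)   # local patterns per time step
h_final = lstm_last_hidden(features, W, U, b, H)  # temporal summary of the window
logits = [sum(w * hv for w, hv in zip(row, h_final)) for row in W_out]
probs = softmax(logits)                           # class probabilities

print("predicted class index:", probs.index(max(probs)))
```

In practice such a model would be built and trained with a deep learning framework; the point here is only that the network consumes the raw window directly, with no hand-crafted features computed beforehand.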
