User-adaptive models for activity and emotion recognition using deep transfer learning and data augmentation

Building predictive models for human-interactive systems is challenging because every individual has unique characteristics and behaviors. Given these between-user differences, a generic human–machine system will not perform equally well for every user. A system built specifically for a particular user will perform closer to the optimum, but it requires more training data from that user, which hinders its applicability in real-world scenarios. Collecting training data can be time-consuming and expensive. For example, in clinical applications it can take weeks or months to collect enough data to start training machine learning models. End users expect to receive quality feedback from a system as soon as possible, without relying on time-consuming calibration and training procedures. In this work, we build and test user-adaptive models (UAM), which are predictive models that adapt to each user's characteristics and behaviors with reduced training data. Our UAM are trained using deep transfer learning and data augmentation and were tested on two public datasets: an activity recognition dataset built from accelerometer data and an emotion recognition dataset built from speech recordings. Our results show that, with reduced training data, the UAM achieve a significant increase in recognition performance over a general model. Furthermore, we show that individual characteristics such as gender can influence the models' performance.
