Multimodal autoencoder: A deep learning approach to filling in missing sensor data and enabling better mood prediction

To forecast mood in real-world settings, affective computing systems must collect and learn from multimodal data gathered over weeks or months of daily use. Such systems are likely to encounter frequent data loss, for example when a phone loses location access or a sensor is recharging. This data loss can handicap classifiers that were trained with all modalities present. This paper describes a new technique for handling missing multimodal data using a specialized denoising autoencoder: the Multimodal Autoencoder (MMAE). Empirical results from over 200 participants and 5500 days of data demonstrate that the MMAE predicts the feature values of multiple missing modalities more accurately than reconstruction methods such as principal components analysis (PCA). We discuss several practical benefits of the MMAE's encoding and show that it can provide robust mood prediction even when up to three quarters of the data sources are lost.
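The abstract does not spell out how missing modalities are represented or scored, so the following is only a minimal sketch under stated assumptions: each modality owns a contiguous block of the daily feature vector, a missing modality is simulated by filling its whole block with a constant, and reconstruction quality is measured as RMSE over the masked features only. The modality names, block sizes, fill value, and helper names (`mask_modalities`, `sample_missing`, `rmse_on_missing`) are hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical feature layout: each modality owns a contiguous block of the
# feature vector. Names and sizes are illustrative, not from the paper.
MODALITIES = {
    "physiology": slice(0, 3),
    "phone": slice(3, 5),
    "location": slice(5, 6),
}

def mask_modalities(features, missing, fill=0.0):
    """Return a copy of `features` with each missing modality's block set to `fill`."""
    corrupted = list(features)
    for name in missing:
        block = MODALITIES[name]
        for i in range(block.start, block.stop):
            corrupted[i] = fill
    return corrupted

def sample_missing(rng, max_missing):
    """Pick a random non-empty set of modalities to drop (assumed corruption scheme)."""
    k = rng.randint(1, max_missing)
    return rng.sample(sorted(MODALITIES), k)

def rmse_on_missing(original, reconstructed, missing):
    """RMSE computed only over the features that belong to the masked modalities."""
    idx = [i for name in missing
           for i in range(MODALITIES[name].start, MODALITIES[name].stop)]
    return math.sqrt(sum((original[i] - reconstructed[i]) ** 2 for i in idx) / len(idx))

# One day of (already normalized) features, with phone and location data lost.
day = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7]
corrupted = mask_modalities(day, ["phone", "location"])
```

In a denoising setup of this kind, a model would be trained to map `corrupted` back to `day`, and `rmse_on_missing` would compare its output against the held-out true values; the same score can be computed for a PCA reconstruction to compare the two approaches.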
