Learning behavioral context recognition with multi-stream temporal convolutional networks

Smart devices in everyday use (such as smartphones and wearables) are increasingly equipped with sensors that provide rich information about a person's daily life, including behavior and context. Automatic and unobtrusive sensing of behavioral context can support solutions for assisted living, fitness tracking, sleep monitoring, and several other fields. Toward this goal, we ask: can a machine learn to recognize a diverse set of contexts and activities in real-life settings through joint learning from raw multi-modal signals (e.g., accelerometer, gyroscope, and audio)? In this paper, we propose a multi-stream temporal convolutional network for multi-label behavioral context recognition. A four-stream architecture learns a representation from each modality, and a contextualization module fuses the extracted representations to infer a user's context. Our empirical evaluation suggests that a deep convolutional network trained end-to-end achieves a strong recognition rate. Furthermore, the presented architecture can be extended with additional sensors to improve performance, and it handles missing modalities through multi-task learning, without any manual feature engineering, on a highly imbalanced and sparsely labeled dataset.
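
To make the described approach concrete, the sketch below shows what a four-stream temporal convolutional network with a fusion ("contextualization") stage might look like in TensorFlow/Keras. This is a minimal, hypothetical illustration, not the authors' exact architecture: the window lengths, channel counts, filter sizes, the fourth (location) modality, and the number of context labels are assumptions made for the example.

# Minimal sketch (assumed hyperparameters, not the authors' exact model) of a
# multi-stream temporal convolutional network for multi-label context recognition.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_LABELS = 51  # hypothetical number of behavioral context/activity labels

def tcn_stream(timesteps, channels, name):
    """One per-modality stream: stacked dilated 1D convolutions over time."""
    inp = layers.Input(shape=(timesteps, channels), name=f"{name}_input")
    x = inp
    for dilation in (1, 2, 4, 8):
        x = layers.Conv1D(64, kernel_size=3, padding="causal",
                          dilation_rate=dilation, activation="relu")(x)
    x = layers.GlobalAveragePooling1D()(x)  # summarize the temporal axis
    return inp, x

# Four streams, one per sensing modality (shapes are placeholders).
acc_in, acc_feat = tcn_stream(800, 3, "accelerometer")
gyr_in, gyr_feat = tcn_stream(800, 3, "gyroscope")
aud_in, aud_feat = tcn_stream(430, 13, "audio_mfcc")
loc_in, loc_feat = tcn_stream(60, 2, "location")

# Contextualization module: fuse the per-modality representations.
fused = layers.Concatenate()([acc_feat, gyr_feat, aud_feat, loc_feat])
fused = layers.Dense(256, activation="relu")(fused)
fused = layers.Dropout(0.5)(fused)

# Multi-label output: independent sigmoids, one per context label.
out = layers.Dense(NUM_LABELS, activation="sigmoid", name="contexts")(fused)

model = Model(inputs=[acc_in, gyr_in, aud_in, loc_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")

Independent sigmoid outputs with a binary cross-entropy loss reflect the multi-label nature of the task, in which several context labels (e.g., "sitting" and "at home") can be active simultaneously.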
