A GRU Neural Network with attention mechanism for detection of risk situations on multimodal lifelog data

Multimedia research today increasingly deals with multimodality: working with heterogeneous signals calls for multimedia techniques of data fusion and mining. Classification on real-world datasets is often challenging. This paper is devoted to the detection of personal risk situations of frail people from multimodal real-world lifelog sensing data, named BIRDS. Using real-world data is challenging because risk situations are rare and last only a few seconds relative to the overall volume of the dataset. We propose a GRU architecture with a global attention block to recognise semantic risk situations from a limited taxonomy. Care is also taken with data organisation and pre-processing, including imputation and normalisation. The proposed method is applied to the collected real-world multimodal dataset and, for comparison with the state of the art, to the open-source UCI-HAR dataset.
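To make the architecture concrete, the sketch below shows the two building blocks the abstract names: a GRU recurrence over a multimodal sensor sequence, followed by a global (soft) attention pooling of the hidden states into a single context vector that a classifier head would consume. This is a minimal pure-Python illustration, not the paper's implementation; all dimensions, weights, and the stand-in input are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, x):
    # plain matrix-vector product over Python lists
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def gru_step(x, h, p):
    """One GRU update (Cho et al. formulation): update gate z, reset gate r,
    candidate state h_tilde, then interpolation between h and h_tilde."""
    z = [sigmoid(a + b + c) for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    rh = [ri * hi for ri, hi in zip(r, h)]
    h_tilde = [math.tanh(a + b + c) for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    return [(1 - zi) * hi + zi * hti for zi, hi, hti in zip(z, h, h_tilde)]

def global_attention(hs, v):
    """Score every hidden state with a learned vector v, softmax-normalise the
    scores, and return the attention-weighted sum (context vector)."""
    scores = [sum(vi * hi for vi, hi in zip(v, h)) for h in hs]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    context = [sum(a * h[i] for a, h in zip(alphas, hs)) for i in range(len(hs[0]))]
    return context, alphas

# Toy run: 2-D input (e.g. two fused sensor channels), 3-D hidden state, 4 time steps.
din, dh, T = 2, 3, 4
p = {
    "Wz": [[0.1] * din] * dh, "Uz": [[0.1] * dh] * dh, "bz": [0.0] * dh,
    "Wr": [[0.1] * din] * dh, "Ur": [[0.1] * dh] * dh, "br": [0.0] * dh,
    "Wh": [[0.1] * din] * dh, "Uh": [[0.1] * dh] * dh, "bh": [0.1] * dh,
}
h = [0.0] * dh
hs = []
for t in range(T):
    x = [0.5, -0.5]  # stand-in for one imputed, normalised multimodal sample
    h = gru_step(x, h, p)
    hs.append(h)
context, alphas = global_attention(hs, v=[0.1] * dh)
```

In a trained model, `context` would feed a dense layer with a softmax over the risk-situation taxonomy, and the attention weights `alphas` indicate which time steps of the lifelog the classifier relied on.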
