Activity Recognition for Ambient Assisted Living with Videos, Inertial Units and Ambient Sensors

Worldwide demographic projections point to a progressively older population. This trend has fostered research on Ambient Assisted Living, which includes developments in smart homes and social robots. To endow such environments with truly autonomous behaviours, algorithms must extract semantically meaningful information from whatever sensor data is available. Human activity recognition is one of the most active fields of research within this context, with approaches varying according to the input modality and the environments considered. Unlike previous work, this paper addresses the problem of recognising heterogeneous activities of daily living in home environments by simultaneously considering data from videos, wearable IMUs and ambient sensors. Two contributions are presented. The first is the creation of the Heriot-Watt University/University of Sao Paulo (HWU-USP) activities dataset, recorded at the Robotic Assisted Living Testbed at Heriot-Watt University. This dataset differs from other multimodal datasets in that it comprises daily living activities with either periodic patterns or long-term dependencies, captured in a rich, heterogeneous sensing environment: it combines data from a humanoid robot's RGB-D (RGB + depth) camera with inertial sensors from wearable devices and ambient sensors from a smart home. The second contribution is a Deep Learning (DL) framework that performs multimodal activity recognition from videos, inertial sensors and ambient smart-home sensors, used individually or fused with one another. The DL classification framework was validated on our dataset and on the University of Texas at Dallas Multimodal Human Action Dataset (UTD-MHAD), a widely used benchmark for activity recognition based on videos and inertial sensors, enabling a comparative analysis of the results on the two datasets. Results demonstrate that incorporating data from ambient sensors substantially improved classification accuracy.
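As an illustration of how such a multimodal framework can be organised, the sketch below shows one plausible late-fusion design in PyTorch: pre-extracted video features, raw IMU windows and aggregated ambient sensor states pass through separate branches whose embeddings are concatenated before a shared classifier. This is a minimal sketch under stated assumptions, not the paper's actual architecture; the branch designs, layer sizes and input dimensions (video_feat_dim, imu_channels, ambient_dim) are illustrative choices.

```python
# Illustrative sketch only: the paper does not specify this exact architecture.
# All hyperparameters below are assumptions chosen to demonstrate late fusion.
import torch
import torch.nn as nn

class MultimodalHAR(nn.Module):
    def __init__(self, num_classes: int, imu_channels: int = 6,
                 ambient_dim: int = 16, video_feat_dim: int = 512):
        super().__init__()
        # Video branch: assumes per-clip features have already been extracted
        # by a pretrained CNN, so only a projection head is needed here.
        self.video_branch = nn.Sequential(
            nn.Linear(video_feat_dim, 256), nn.ReLU(), nn.Dropout(0.5))
        # IMU branch: temporal convolutions over accelerometer/gyroscope windows.
        self.imu_branch = nn.Sequential(
            nn.Conv1d(imu_channels, 64, kernel_size=5), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        # Ambient branch: smart-home sensor states aggregated per time window.
        self.ambient_branch = nn.Sequential(
            nn.Linear(ambient_dim, 32), nn.ReLU())
        # Late fusion: concatenate the branch embeddings, then classify.
        self.classifier = nn.Linear(256 + 64 + 32, num_classes)

    def forward(self, video_feats, imu_window, ambient_states):
        v = self.video_branch(video_feats)       # (B, 256)
        i = self.imu_branch(imu_window)          # (B, 64)
        a = self.ambient_branch(ambient_states)  # (B, 32)
        return self.classifier(torch.cat([v, i, a], dim=1))

# Example shapes: a batch of 4 windows with 512-d video features,
# a 6-axis IMU sampled over 128 steps, and 16 ambient sensors.
model = MultimodalHAR(num_classes=9)
logits = model(torch.randn(4, 512), torch.randn(4, 6, 128), torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 9])
```

Concatenation-based late fusion keeps each modality's pipeline independent, so any single branch can also be trained and evaluated on its own, mirroring the per-modality versus fused comparisons described in the abstract.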
