A Hierarchical Deep Fusion Framework for Egocentric Activity Recognition using a Wearable Hybrid Sensor System

Egocentric activity recognition has recently attracted considerable attention in the pattern recognition and artificial intelligence communities because of its wide applicability in medical care, smart homes, and security monitoring. In this study, we developed and implemented a deep-learning-based hierarchical fusion framework for recognizing egocentric activities of daily living (ADLs) with a wearable hybrid sensor system comprising motion sensors and cameras. A long short-term memory (LSTM) network and a convolutional neural network (CNN) perform ADL recognition on the motion sensor data and the photo stream, respectively, in different layers of the hierarchy. The motion sensor data are used solely to classify activities by motion state, while the photo stream is used to recognize the specific activity within each motion state group. Thus, both the motion sensor data and the photo stream operate in their most suitable classification mode, which significantly reduces the negative influence of sensor differences on the fusion results. Experimental results show that the proposed method is not only more accurate than the existing direct fusion method (by up to 6%) but also avoids that method's time-consuming optical flow computation, making the proposed algorithm less complex and more suitable for practical application.
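The two-level scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the group names, activity labels, variance threshold, and the functions `classify_motion_state`, `classify_within_group`, and `hierarchical_predict` are all hypothetical. A simple variance threshold stands in for the LSTM, and a dictionary of per-activity scores stands in for the CNN output on one photo. The sketch shows only the routing logic: the motion data select a motion state group, and the image scores are then compared only among the activities in that group.

```python
from statistics import pvariance
from typing import Dict, List, Sequence, Tuple

# Hypothetical motion-state groups and their activities (not from the paper).
GROUPS: Dict[str, List[str]] = {
    "sedentary": ["reading", "computer_use", "eating"],
    "walking": ["walking_indoors", "walking_outdoors"],
}


def classify_motion_state(window: Sequence[float]) -> str:
    """Stand-in for the LSTM layer.

    Assumption: a window of acceleration magnitudes with high variance
    means the wearer is moving. The 0.5 threshold is arbitrary.
    """
    return "walking" if pvariance(window) > 0.5 else "sedentary"


def classify_within_group(group: str, image_scores: Dict[str, float]) -> str:
    """Stand-in for the CNN layer.

    `image_scores` plays the role of per-activity scores produced from one
    photo. Only activities in the selected group are candidates.
    """
    candidates = GROUPS[group]
    return max(candidates, key=lambda a: image_scores.get(a, float("-inf")))


def hierarchical_predict(
    window: Sequence[float], image_scores: Dict[str, float]
) -> Tuple[str, str]:
    """Layer 1 picks the motion state group; layer 2 picks the activity."""
    group = classify_motion_state(window)
    activity = classify_within_group(group, image_scores)
    return group, activity


# Illustrative inputs (made-up numbers).
still_window = [1.0, 1.01, 0.99, 1.0]
moving_window = [0.2, 2.1, 0.1, 2.4]
scores = {
    "reading": 0.2,
    "computer_use": 0.7,
    "eating": 0.1,
    "walking_indoors": 0.9,
    "walking_outdoors": 0.3,
}

still_result = hierarchical_predict(still_window, scores)
moving_result = hierarchical_predict(moving_window, scores)
```

In this toy input, `walking_indoors` has the highest image score overall, yet `still_result` is `("sedentary", "computer_use")` because the motion layer has already ruled out the walking group. With the moving window, the same scores yield `("walking", "walking_indoors")`.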
