Paper information - Feature learning for Human Activity Recognition using Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are increasingly used as a feature learning method for Human Activity Recognition (HAR). Unlike conventional machine learning methods, which require domain-specific expertise, CNNs can extract features automatically. On the other hand, CNNs require a training phase, which makes them prone to the cold-start problem. This work presents a case study in which a pre-trained CNN feature extractor is evaluated under realistic conditions. The case study consists of two main steps: (1) different topologies and parameters are assessed to identify the best candidate models for HAR, yielding a pre-trained CNN model; (2) the pre-trained model is then employed as a feature extractor and evaluated on a large-scale real-world dataset. Two CNN applications were considered: Inertial Measurement Unit (IMU) based HAR and audio based HAR. For the IMU data, balanced accuracy was 91.98% on the UCI-HAR dataset and 67.51% on the real-world Extrasensory dataset. For the audio data, balanced accuracy was 92.30% on the DCASE 2017 dataset and 35.24% on the Extrasensory dataset.
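The results above are reported as balanced accuracy, which the abstract does not define. The sketch below assumes the common definition (the unweighted mean of per-class recall), which matters for imbalanced, in-the-wild data because a classifier that favours the majority class is not rewarded for it. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition, not from the paper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall of each class that occurs in y_true, then the plain average.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, deliberately imbalanced toy labels: 8 "walk" windows, 2 "sit" windows.
y_true = ["walk"] * 8 + ["sit"] * 2
y_pred = ["walk"] * 8 + ["walk", "sit"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain_accuracy:.2f}")
print(f"balanced accuracy: {balanced:.2f}")
```

In this toy case the minority class is recognised only half of the time, so the balanced score falls well below the plain accuracy even though nine of ten windows are labelled correctly.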
