AttriNet: learning mid-level features for human activity recognition with deep belief networks

Human activity recognition (HAR) is essential to many context-aware applications in mobile and ubiquitous computing. A human's physical activity can be decomposed into a sequence of simple actions or body movements, which correspond to what we call mid-level features. Such mid-level features ("leg up," "leg down," "leg still," ...), which we contrast with high-level activities ("walking," "sitting," ...) and low-level features (raw sensor readings), can be defined manually. While proven effective, this manual approach does not scale and relies heavily on human domain expertise. In this paper, we address this limitation by proposing a machine learning method, AttriNet, based on deep belief networks. AttriNet automatically constructs mid-level features and outperforms baseline approaches. Interestingly, we show in experiments that some of the features learned by AttriNet correlate highly with manually defined features. This result demonstrates the potential of deep learning techniques for learning semantically meaningful mid-level features as a replacement for handcrafted features. More generally, this empirical finding provides an improved understanding of deep learning methods for HAR.
