A streaming ensemble classifier with multi-class imbalance learning for activity recognition

Multi-class imbalance learning on data streams in smart home applications is an evolving research area that combines the challenges of multi-class imbalance with those of stream learning. Imbalanced multi-class distributions already lead to misleading classification outcomes, and a further difficulty is that the imbalance ratio in a sensor data stream changes rapidly over time. Because sensor data streams are inadequately represented and their class distributions are skewed, learning from such data calls for a new algorithm that balances the data and builds a model from it in a streaming fashion. In this paper, we propose a new multi-class stream imbalance ensemble method whose base learner is a Naïve Bayes classifier. In this approach, whether a training instance from any class takes part in learning is decided by thresholding on the median prior probability, which helps to balance the classes. Our method differs from state-of-the-art approaches in that it is robust to outliers, retains more useful information, and is less sensitive to over-fitting. It also has a simple conceptual justification and is easy to implement. We illustrate the effectiveness of the proposed method on two smart home testbed datasets, where it compares favourably with state-of-the-art approaches.
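The abstract gives no algorithmic detail, so the following is only a minimal sketch of how median-prior thresholding could feed a streaming Naïve Bayes ensemble, not the authors' implementation. Everything specific here is an assumption made for illustration: the class names `StreamingNaiveBayes` and `MedianPriorEnsemble`, binary sensor features, Laplace smoothing, majority voting, five ensemble members, and in particular the keep rule (an instance is always used when its class prior is at or below the median of the running priors, and otherwise used with probability median / prior).

```python
import math
import random
from collections import defaultdict
from statistics import median


class StreamingNaiveBayes:
    """Incremental Naive Bayes over binary (0/1) features, Laplace-smoothed."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.class_counts = defaultdict(int)
        # on_counts[label][j] = training instances of `label` with feature j == 1
        self.on_counts = defaultdict(lambda: [0] * n_features)

    def learn_one(self, x, label):
        self.class_counts[label] += 1
        counts = self.on_counts[label]
        for j, value in enumerate(x):
            counts[j] += value

    def log_scores(self, x):
        """Return the unnormalised log posterior of every class seen so far."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for j, value in enumerate(x):
                p_on = (self.on_counts[label][j] + 1) / (n + 2)
                score += math.log(p_on if value else 1.0 - p_on)
            scores[label] = score
        return scores


class MedianPriorEnsemble:
    """Hypothetical ensemble: each member sees a median-thresholded sample."""

    def __init__(self, n_features, n_members=5, seed=0):
        self.members = [StreamingNaiveBayes(n_features) for _ in range(n_members)]
        self.stream_counts = defaultdict(int)  # counts over the raw stream
        self.rng = random.Random(seed)

    def keep_probability(self, label):
        # Assumed rule: classes at or below the median prior are always kept;
        # classes above it are thinned down towards the median.
        total = sum(self.stream_counts.values())
        priors = {c: n / total for c, n in self.stream_counts.items()}
        threshold = median(priors.values())
        prior = priors[label]
        return 1.0 if prior <= threshold else threshold / prior

    def learn_one(self, x, label):
        self.stream_counts[label] += 1
        p_keep = self.keep_probability(label)
        for member in self.members:
            # Each member draws independently, so members see different samples.
            if self.rng.random() < p_keep:
                member.learn_one(x, label)

    def predict_one(self, x):
        votes = defaultdict(int)
        for member in self.members:
            scores = member.log_scores(x)
            if scores:
                votes[max(scores, key=scores.get)] += 1
        return max(votes, key=votes.get) if votes else None
```

Under these assumptions, the running priors are recomputed on every arriving instance, so the threshold follows a changing imbalance ratio without a separate drift detector. Using the median rather than the mean of the priors keeps one dominant activity class from pulling the threshold upwards, which is one plausible reading of the robustness to outliers claimed above.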
