Optimal Bayesian classification in nonstationary streaming environments

A novel method for classifying data drawn from a nonstationary distribution with drifting mean and variance is presented. The novelty of the approach lies in splitting the problem of tracking a nonstationary distribution into separate classification and time series state estimation problems. State space models for drift in both the mean and the variance are presented; these are tracked using a Kalman filter for the linear part and a particle filter for the non-linear part. Preliminary results, which indicate the potential of the approach, are also presented, along with concluding remarks on its possible uses.
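To make the state estimation half of this split concrete, the following is a minimal sketch of a scalar Kalman filter tracking a drifting mean. It is not the paper's model, which the abstract does not give. It assumes a random-walk state equation, mu_t = mu_{t-1} + w_t with w_t ~ N(0, q), and a direct noisy observation, x_t = mu_t + v_t with v_t ~ N(0, r). The function names and the parameters q, r, m0 and p0 are illustrative. Variance drift, which the abstract handles with a particle filter, is not shown.

```python
import random


def kalman_track_mean(observations, q=0.01, r=1.0, m0=0.0, p0=1.0):
    """Track a drifting mean with a scalar Kalman filter.

    Assumed model (an illustration, not taken from the paper):
        mu_t = mu_{t-1} + w_t,  w_t ~ N(0, q)   (random-walk drift)
        x_t  = mu_t + v_t,      v_t ~ N(0, r)   (noisy observation)
    Returns the filtered mean estimate after each observation.
    """
    m, p = m0, p0  # current state estimate and its variance
    estimates = []
    for x in observations:
        # Predict: a random walk keeps the mean, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and observation using the Kalman gain.
        k = p / (p + r)
        m = m + k * (x - m)
        p = (1.0 - k) * p
        estimates.append(m)
    return estimates


def simulate_drifting_stream(n, q=0.01, r=1.0, seed=0):
    """Draw n samples from the assumed random-walk model above."""
    rng = random.Random(seed)
    mu = 0.0
    true_means, samples = [], []
    for _ in range(n):
        mu += rng.gauss(0.0, q ** 0.5)
        true_means.append(mu)
        samples.append(mu + rng.gauss(0.0, r ** 0.5))
    return true_means, samples


if __name__ == "__main__":
    true_means, samples = simulate_drifting_stream(500)
    estimates = kalman_track_mean(samples)
    print("final true mean:", true_means[-1])
    print("final estimate: ", estimates[-1])
```

Under these assumptions, the filtered mean could then be passed to a separate classifier (for example, as the current class-conditional mean), which is the kind of decoupling the abstract describes.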
