Non-stationary data sequence classification using online class priors estimation

Online classification is important for real-time data sequence classification. Its most challenging problem is that the class priors may vary for non-stationary data sequences. Most current online data-sequence classification algorithms assume that the class labels of some newly arrived samples are known and retrain the classifier accordingly. Unfortunately, this assumption is often violated in real applications. If, however, the class priors of the test sequence could be estimated accurately, the classifier could be adjusted without retraining while preserving reasonable accuracy. Prior work has estimated class priors on static data sets using an offline iterative EM algorithm, which has proven quite effective for adjusting the classifier. Inspired by this offline iterative EM algorithm for static data sets, in this paper we propose an online incremental EM algorithm that estimates the class priors along the data sequence; the classifier is adjusted accordingly to keep pace with the varying distribution. The proposed online algorithm is more computationally efficient because it scans the sequence only once. Experimental results show that the proposed algorithm indeed performs better than the conventional offline iterative EM algorithm when the class priors are non-stationary.
