Use of ensembles of Fourier spectra in capturing recurrent concepts in data streams

In this research, we apply ensembles of Fourier-encoded spectra to capture and mine recurring concepts in a data stream environment. Previous research showed that applying the Discrete Fourier Transform to Decision Trees yields compact representations that accurately capture recurrent concepts in a data stream. However, in highly volatile environments where new concepts emerge often, encoding each concept in a separate spectrum is no longer viable because of memory overload. We therefore present an ensemble approach that addresses this problem. Our empirical results on real-world and synthetic data exhibiting varying degrees of recurrence reveal that the ensemble approach outperforms the single-spectrum approach in terms of classification accuracy, memory usage, and execution time.
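To make the idea of a Fourier-encoded Decision Tree concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary features and uses a hypothetical three-feature toy tree; the function names (`tree`, `basis`, `spectrum`, `truncate`, `reconstruct`) are illustrative only. It computes the Fourier (Walsh) coefficients of the tree's output function, discards coefficients above a chosen order, and rebuilds predictions from what remains. The ensemble mechanism itself is not described in the abstract and is not modelled here.

```python
from itertools import product


def tree(x):
    """Hypothetical toy decision tree over three binary features (depth 2)."""
    if x[0] == 0:
        return 1 if x[1] == 1 else 0
    return 1 if x[2] == 1 else 0


def basis(j, x):
    """Fourier (Walsh) basis function: (-1) raised to the dot product j . x."""
    return -1 if sum(a * b for a, b in zip(j, x)) % 2 else 1


def spectrum(f, d):
    """Return all 2^d Fourier coefficients of f over d binary features."""
    points = list(product((0, 1), repeat=d))
    return {
        j: sum(f(x) * basis(j, x) for x in points) / len(points)
        for j in points
    }


def truncate(coeffs, max_order):
    """Keep only coefficients whose index has at most max_order ones."""
    return {j: w for j, w in coeffs.items() if sum(j) <= max_order}


def reconstruct(coeffs, x):
    """Inverse transform: evaluate the (possibly truncated) spectrum at x."""
    return sum(w * basis(j, x) for j, w in coeffs.items())


full = spectrum(tree, 3)
# The toy tree tests at most two features on any path, so dropping
# coefficients of order 3 loses nothing in this particular example.
compact = truncate(full, max_order=2)
nonzero = {j: w for j, w in compact.items() if w != 0}
```

In this toy case only the nonzero entries of `compact` need to be stored, which is the sense in which a spectrum can be more compact than the tree's full truth table. For deeper trees, truncating at a low order would generally be an approximation rather than an exact encoding.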
