A new ensemble method for multi-label data stream classification in non-stationary environment

Most existing approaches to data stream classification in non-stationary environments focus on single-label data, where each instance can be tagged with only one label. In many realistic applications, however, each instance should be tagged with more than one label. To address the challenge of classifying multi-label streams in evolving environments, we propose a novel Multi-Label Dynamic Ensemble (MLDE) approach. MLDE integrates a number of Multi-Label Cluster-based Classifiers (MLCCs) and comprises an adaptive ensemble method and an ensemble voting method that uses two weights: a subset accuracy weight and a similarity weight. Experimental results reveal that MLDE achieves better performance than state-of-the-art multi-label stream classification algorithms.
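As a rough illustration of weighted ensemble voting over multi-label predictions, the following Python sketch computes subset accuracy (the fraction of instances whose predicted label set matches the true label set exactly) and uses it, together with a per-member similarity score, to weight each member's vote. The function names, the product used to combine the two weights, and the 0.5 decision threshold are assumptions made for illustration; the abstract does not specify how MLDE computes or combines its weights.

```python
from typing import List, Sequence


def subset_accuracy(y_true: Sequence[Sequence[int]],
                    y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of instances whose predicted label vector matches exactly."""
    if not y_true:
        return 0.0
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def weighted_vote(member_preds: Sequence[Sequence[int]],
                  acc_weights: Sequence[float],
                  sim_weights: Sequence[float],
                  threshold: float = 0.5) -> List[int]:
    """Combine binary label vectors from several ensemble members.

    Assumption: each member's vote is weighted by the product of its
    subset accuracy weight and its similarity weight. This combination
    rule is hypothetical, not taken from the paper.
    """
    combined = [a * s for a, s in zip(acc_weights, sim_weights)]
    total = sum(combined)
    n_labels = len(member_preds[0])
    if total == 0:
        return [0] * n_labels
    result = []
    for j in range(n_labels):
        # Weighted support for label j among members that predict it.
        support = sum(w for w, pred in zip(combined, member_preds)
                      if pred[j] == 1)
        result.append(1 if support / total >= threshold else 0)
    return result


# Three hypothetical members voting on three labels for one instance.
preds = [[1, 0, 1], [0, 1, 1], [1, 0, 0]]
final_labels = weighted_vote(preds, [0.8, 0.4, 0.6], [1.0, 0.5, 0.5])
```

In this sketch a label is assigned only when the members supporting it hold at least half of the total combined weight, so a member that is both accurate on recent data and similar to the incoming instance has more influence on the final label set.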