Paper Information - Classifying Data Streams with Skewed Class Distributions and Concept Drifts

Classifying Data Streams with Skewed Class Distributions and Concept Drifts

Classification is an important data analysis tool that uses a model built from historical data to predict class labels for new observations. A growing number of applications produce data streams rather than finite stored data sets, and these streams challenge traditional classification algorithms. Concept drifts and skewed distributions, two common properties of data stream applications, make learning in streams difficult. The authors aim to develop a new approach to classifying skewed data streams that uses an ensemble of models to match the distribution over under-samples of negatives and repeated samples of positives.
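The ensemble idea in the last sentence of the abstract can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so everything below is an assumption made for illustration: the negatives are split into k disjoint under-samples, the same positives are reused in every training set, and the per-model scores are averaged. The centroid-based base learner and the names `train_centroid_model`, `positive_score`, `build_ensemble`, and `ensemble_score` are hypothetical and are not taken from the paper.

```python
import random


def train_centroid_model(samples):
    """Toy base learner (an assumption): store the mean vector of each class."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def positive_score(model, x):
    """Score in [0, 1]; higher means x is closer to the positive centroid."""
    d_pos = sum((a - b) ** 2 for a, b in zip(x, model[1]))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, model[0]))
    total = d_pos + d_neg
    return 0.5 if total == 0 else d_neg / total


def build_ensemble(positives, negatives, k, seed=0):
    """Train k models, each on all positives plus one disjoint slice of negatives.

    Requires at least one positive and at least k negatives.
    """
    rng = random.Random(seed)
    shuffled = negatives[:]
    rng.shuffle(shuffled)
    models = []
    for i in range(k):
        negative_part = shuffled[i::k]  # disjoint under-sample of the negatives
        data = [(x, 1) for x in positives] + [(x, 0) for x in negative_part]
        models.append(train_centroid_model(data))
    return models


def ensemble_score(models, x):
    """Average the positive-class scores of all models in the ensemble."""
    return sum(positive_score(m, x) for m in models) / len(models)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic skewed chunk: 5 positives near (5, 5), 100 negatives near (0, 0).
    positives = [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(5)]
    negatives = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(100)]
    models = build_ensemble(positives, negatives, k=10)
    print(ensemble_score(models, [5.0, 5.0]) > 0.5)
    print(ensemble_score(models, [0.0, 0.0]) < 0.5)
```

With k = 10 and a 5:100 class ratio, each model in this sketch sees 5 positives and 10 negatives, so no single training set is as skewed as the original chunk, while the ensemble as a whole still uses every negative example once.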

[1] Marcus A. Maloof, et al. Using additive expert ensembles to cope with concept drift, 2005, ICML.

[2] Ian Witten, et al. Data Mining, 2000.

[3] Jennifer Widom, et al. Models and issues in data stream systems, 2002, PODS.

[4] Shonali Krishnaswamy, et al. Mining data streams: a review, 2005, SIGMOD Record.

[5] Geoff Hulten, et al. Mining time-changing data streams, 2001, KDD '01.

[6] Nitesh V. Chawla, et al. Editorial: special issue on learning from imbalanced data sets, 2004, SIGKDD Explorations.

[7] Gustavo E. A. P. A. Batista, et al. A study of the behavior of several methods for balancing machine learning training data, 2004, SIGKDD Explorations.

[8] Gerhard Widmer, et al. Learning in the Presence of Concept Drift and Hidden Contexts, 1996, Machine Learning.

[9] Kagan Tumer, et al. Analysis of decision boundaries in linearly combined neural classifiers, 1996, Pattern Recognit.

[10] Charu C. Aggarwal, et al. Data Streams - Models and Algorithms, 2014, Advances in Database Systems.

[11] Pedro M. Domingos. A Unified Bias-Variance Decomposition and its Applications, 2000, ICML.

[12] Philip S. Yu, et al. Mining concept-drifting data streams using ensemble classifiers, 2003, KDD '03.

[13] S. Muthukrishnan, et al. Data streams: algorithms and applications, 2005, SODA '03.

[14] Philip S. Yu, et al. On demand classification of data streams, 2004, KDD.
