Concept Drift Detection on Unlabeled Data Streams: A Systematic Literature Review

Applications that process dynamic data streams are subject to changes in the underlying data distribution, which in data stream mining is known as concept drift. A data stream mining model must be able to adapt to concept drift; otherwise, its learning performance risks deteriorating. Most existing surveys of concept drift detection methods focus on labeled data streams, and their findings may not apply to scenarios where true labels are unavailable. This systematic literature review examines existing concept drift detection methods for unlabeled data streams, focusing on the learning process and on how concept drift is monitored in the data stream mining model. A total of 15 articles were selected for final analysis, and most of the drift detection methods they describe were found to be applied in a supervised learning setting. Future work could experimentally evaluate these methods to investigate their performance in an unsupervised learning setting.
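To make the idea of monitoring drift without true labels concrete, the following is a minimal sketch, not a method taken from the reviewed articles. It assumes a single numeric feature, a fixed reference window, and a two-sample Kolmogorov-Smirnov comparison against the most recent window; the function names, window sizes, and the significance level `alpha` are all illustrative assumptions.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    # Standard asymptotic approximation; an assumption that both windows are large.
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Hypothetical stream: the feature's mean shifts from 0 to 2 after the reference window.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(500)]

print(drift_detected(reference, shifted))          # large shift in the feature distribution
print(drift_detected(reference, list(reference)))  # identical windows, statistic is 0
```

No labels are used anywhere in this check: it reacts only to a change in the feature distribution, so it can miss drift that alters the relationship between features and labels while leaving the features themselves unchanged.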
