Volatility Drift Prediction for Transactional Data Streams

The reasons for concept drift in a data stream can vary widely, from the deterioration of a machine to a change in people's buying patterns. To detect concept drifts effectively, most predictive stream mining systems contain a drift detector that monitors the stream and signals when a drift occurs. However, few of these systems are designed to find drifts in transactional datasets, which contain unlabelled data. Transactional datasets describe events, such as orders or payments, and are traditionally analysed using association rules. In this paper, we propose a novel drift detection technique, ProChange, that has two parts. The first part is a drift detector, VR-Change, that finds both real and virtual drifts in unlabelled transactional data streams using the Hellinger distance. The second part is a drift predictor, which models the volatility of drifts using a probabilistic network to predict the location of future drifts. Using the predictor, we can dynamically adapt the confidence threshold, enabling VR-Change to be more sensitive around potential future drift points. We evaluated ProChange by comparing it against traditional detectors, and the results show that it detects both real and virtual drifts accurately and efficiently.
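
To make the role of the Hellinger distance concrete, the following is a minimal sketch of how two windows of unlabelled transactions might be compared. It is not the VR-Change algorithm itself: the helper names (`item_distribution`, `hellinger`, `drift_detected`), the use of single-item relative frequencies, and the fixed threshold of 0.3 are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def item_distribution(transactions):
    """Relative frequency of each item across a window of transactions."""
    counts = Counter(item for t in transactions for item in t)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {item: c / total for item, c in counts.items()}


def hellinger(p, q):
    """Hellinger distance between two discrete distributions given as dicts.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with no items in common.
    """
    items = set(p) | set(q)
    squared = sum((sqrt(p.get(i, 0.0)) - sqrt(q.get(i, 0.0))) ** 2 for i in items)
    return sqrt(squared / 2.0)


def drift_detected(reference, current, threshold=0.3):
    """Flag a drift when the windows differ by more than the threshold.

    The threshold value is a hypothetical default, not one taken from the paper.
    """
    distance = hellinger(item_distribution(reference), item_distribution(current))
    return distance > threshold


# Hypothetical windows of transactions (each transaction is a set of items).
reference_window = [{"bread", "milk"}, {"bread", "butter"}]
current_window = [{"beer", "chips"}, {"beer", "nuts"}]

same = hellinger(item_distribution(reference_window), item_distribution(reference_window))
different = hellinger(item_distribution(reference_window), item_distribution(current_window))
flag = drift_detected(reference_window, current_window)
```

In this sketch the threshold is a plain parameter, so a predictor of the kind described above could, in principle, lower it near a predicted drift point to make the detector more sensitive there.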
