A New Study of Two Divergence Metrics for Change Detection in Data Streams

Streaming data are dynamic in nature and change frequently. To detect such changes, most methods measure the difference between the data distributions in a current time window and a reference window. Measuring this difference requires both a divergence metric and an estimate of each window's density. Our study shows that the Kullback-Leibler (KL) divergence, the most widely used metric for comparing distributions, fails to detect certain changes because of its asymmetry and its dependence on the variance of the data. We therefore consider two metrics for detecting changes in univariate data streams: a symmetric KL-divergence and a divergence metric based on the intersection area of two distributions. Experimental results show that both metrics detect changes more accurately than baseline methods such as Change Finder and the conventional KL-divergence.
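To make the window-comparison setup concrete, the sketch below computes the two divergences between a reference window and a current window of a univariate stream. The paper's abstract does not fix a density estimator or the exact form of the intersection metric, so this is a minimal illustration under assumptions: Gaussian kernel density estimates evaluated on a shared grid, and the intersection-based divergence taken as one minus the overlap area of the two (discretized) densities. Function names such as window_divergences are illustrative, not from the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence KL(p||q) + KL(q||p) over discretized densities."""
    p = np.clip(p, eps, None); p = p / p.sum()
    q = np.clip(q, eps, None); q = q / q.sum()
    return np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))

def intersection_divergence(p, q):
    """One minus the intersection area of two discretized densities (assumed form)."""
    p = p / p.sum()
    q = q / q.sum()
    return 1.0 - np.minimum(p, q).sum()

def window_divergences(reference, current, n_grid=200):
    """Estimate each window's density with a Gaussian KDE on a common grid,
    then evaluate both divergence metrics."""
    grid = np.linspace(min(reference.min(), current.min()),
                       max(reference.max(), current.max()), n_grid)
    p = gaussian_kde(reference)(grid)
    q = gaussian_kde(current)(grid)
    return symmetric_kl(p, q), intersection_divergence(p, q)

# Toy usage: a mean shift between the reference and current windows
# produces large values for both divergences.
rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, 500)
cur = rng.normal(1.5, 1.0, 500)
print(window_divergences(ref, cur))
```

In a streaming setting, a change would typically be flagged when either divergence between the sliding current window and the reference window exceeds a threshold; the thresholding and windowing policy are left unspecified here.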