Estimation of Multiple Quantiles in Dynamically Varying Data Streams

In this paper we consider the problem of estimating quantiles when data are received sequentially, i.e., as a data stream. For real-life data streams, the distribution of the data typically varies with time, which makes quantile estimation challenging. We present a method that simultaneously maintains estimates of multiple quantiles of the data stream distribution. The method makes an incremental update of each quantile estimate every time a new sample is received from the data stream. It is efficient in both memory and computation, since it stores only one value per quantile estimate and performs only one operation per quantile estimate for each new sample. The estimates are realistic in the sense that the monotone property of quantiles is satisfied in every iteration. Experiments show that the method efficiently tracks multiple quantiles and outperforms state-of-the-art methods.
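To make the idea of incremental multi-quantile tracking concrete, the following is a minimal Python sketch. It is not the update rule proposed in the paper, which the abstract does not specify. It assumes a generic additive stochastic-approximation step with a fixed, hypothetical step size `step`, and it restores monotonicity with a simple running-maximum pass, which is an assumption made purely for illustration. The names `update_quantiles` and `track` are likewise hypothetical.

```python
import random


def update_quantiles(estimates, probs, x, step=0.01):
    """Apply one incremental update to every quantile estimate for sample x.

    Assumed rule (illustrative only): move an estimate up by step * p when
    the sample lies above it, and down by step * (1 - p) otherwise.
    """
    updated = []
    for q, p in zip(estimates, probs):
        if x > q:
            q = q + step * p
        else:
            q = q - step * (1.0 - p)
        updated.append(q)
    # Assumed monotonicity repair: a running maximum, so that an estimate for
    # a higher probability never falls below one for a lower probability.
    for i in range(1, len(updated)):
        if updated[i] < updated[i - 1]:
            updated[i] = updated[i - 1]
    return updated


def track(stream, probs, init=0.0, step=0.01):
    """Run the update over a stream, storing one value per quantile."""
    estimates = [init] * len(probs)
    for x in stream:
        estimates = update_quantiles(estimates, probs, x, step)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.1, 0.5, 0.9]  # must be sorted in increasing order
    stream = (rng.gauss(0.0, 1.0) for _ in range(20000))
    print(track(stream, probs))
```

Because the step size is fixed rather than decaying, the estimates keep adapting when the distribution drifts, at the cost of some residual fluctuation around the true quantiles. A larger `step` tracks changes faster but gives noisier estimates.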
