Survey on Query Estimation in Data Streams

Query estimation plays an important role in query optimization by guiding the choice of a query plan. It becomes considerably more challenging for fast, continuous, online data streams. Summarization methods such as sampling, histograms, wavelets, sketches, and discrete cosine series are used to store the data distribution for query estimation. This paper presents a brief survey of query estimation techniques for data streams.
