Practical Algorithms for Tracking Database Join Sizes

We present novel algorithms for estimating the size of the natural join of two data streams. The algorithms process updates efficiently and provide high-quality estimates.
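The abstract does not spell out the algorithms, so the following is only a rough illustration of the problem being solved, not the method of this paper. It assumes the join size of two streams is the sum, over join-attribute values, of the product of their frequencies, and it estimates that sum with the inner product of two count-min sketches, a standard streaming technique. The names `CountMinSketch`, `estimate_join_size`, `exact_join_size` and the parameters `width`, `depth`, and `seed` are hypothetical choices made for this sketch.

```python
import random
from collections import Counter

PRIME = (1 << 61) - 1  # Mersenne prime used by the modular hash family


class CountMinSketch:
    """Illustrative count-min sketch over integer join keys (assumed design)."""

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One (a, b) pair per row defines h(x) = ((a*x + b) mod PRIME) mod width.
        self.params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
                       for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def update(self, key, count=1):
        # Each stream update touches exactly one counter per row.
        for row, (a, b) in zip(self.rows, self.params):
            row[((a * key + b) % PRIME) % self.width] += count


def estimate_join_size(sketch_r, sketch_s):
    # Both sketches must share width, depth, and seed so their hashes agree.
    # With non-negative frequencies, each row's inner product overestimates
    # the true join size, so the minimum over rows is the tightest estimate.
    return min(sum(x * y for x, y in zip(row_r, row_s))
               for row_r, row_s in zip(sketch_r.rows, sketch_s.rows))


def exact_join_size(stream_r, stream_s):
    # Ground truth: sum over values v of f_R(v) * f_S(v).
    freq_r, freq_s = Counter(stream_r), Counter(stream_s)
    return sum(count * freq_s[value] for value, count in freq_r.items())


stream_r = [1, 1, 2, 3]
stream_s = [1, 2, 2, 4]

sketch_r = CountMinSketch(width=64, depth=4, seed=42)
sketch_s = CountMinSketch(width=64, depth=4, seed=42)
for key in stream_r:
    sketch_r.update(key)
for key in stream_s:
    sketch_s.update(key)

exact = exact_join_size(stream_r, stream_s)        # 2*1 + 1*2 + 1*0 = 4
estimate = estimate_join_size(sketch_r, sketch_s)  # never below the exact value
print(exact, estimate)
```

Each update costs one counter increment per row, which is the kind of per-update cost an efficient stream summary aims for. The estimate can only overshoot the exact value here because hash collisions add non-negative cross terms.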
