Sampling Algorithms for Evolving Datasets

[1]  Sridhar Ramaswamy,et al.  Join synopses for approximate query answering , 1999, SIGMOD '99.

[2]  Peter J. Haas,et al.  A bi-level Bernoulli scheme for database sampling , 2004, SIGMOD '04.

[3]  Andrei Z. Broder,et al.  On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997.

[4]  David J. DeWitt,et al.  Practical Skew Handling in Parallel Joins , 1992, VLDB.

[5]  Yossi Matias,et al.  New sampling-based summary statistics for improving approximate query answers , 1998, SIGMOD '98.

[6]  Peter J. Haas,et al.  On synopses for distinct-value estimation under multiset operations , 2007, SIGMOD '07.

[7]  Dorothy E. Denning,et al.  Secure statistical databases with random sample queries , 1980, TODS.

[8]  Mong-Li Lee,et al.  ICICLES: Self-Tuning Samples for Approximate Query Answering , 2000, VLDB.

[9]  Srikanta Tirthapura,et al.  Estimating simple functions on the union of data streams , 2001, SPAA '01.

[10]  Paul Brown,et al.  CORDS: automatic discovery of correlations and soft functional dependencies , 2004, SIGMOD '04.

[11]  Peter J. Haas,et al.  Maintaining bounded-size sample synopses of evolving datasets , 2008, The VLDB Journal.

[12]  Viswanath Poosala,et al.  Congressional Samples for Approximate Answering of Group-By Queries , 2000, SIGMOD Conference.

[13]  Raghu Ramakrishnan,et al.  Synopses for query optimization: A space-complexity perspective , 2004, TODS.

[14]  Yossi Matias,et al.  Fast incremental maintenance of approximate histograms , 1997, TODS.

[15]  A. I. McLeod,et al.  A Convenient Algorithm for Drawing a Simple Random Sample , 1983 .

[16]  Rainer Gemulla,et al.  Sampling algorithms for evolving datasets , 2008 .

[17]  Chris Jermaine,et al.  Scalable approximate query processing with the DBO engine , 2007, SIGMOD '07.

[18]  Ruoming Jin,et al.  New Sampling-Based Estimators for OLAP Queries , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[19]  Yufei Tao,et al.  Random Sampling for Continuous Streams with Arbitrary Updates , 2007 .

[20]  Peter J. Haas,et al.  Techniques for Warehousing of Sample Data , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[21]  Rajeev Motwani,et al.  Overcoming limitations of sampling for aggregation queries , 2001, Proceedings 17th International Conference on Data Engineering.

[22]  Jeffrey F. Naughton,et al.  Selectivity and Cost Estimation for Joins Based on Random Sampling , 1996, J. Comput. Syst. Sci.

[23]  Nick G. Duffield,et al.  Trajectory sampling for direct traffic observation , 2001, TNET.

[24]  Helen J. Wang,et al.  Online aggregation , 1997, SIGMOD '97.

[25]  Chris Jermaine,et al.  Sampling-based estimators for subset-based queries , 2008, The VLDB Journal.

[26]  Carsten Lund,et al.  Charging from sampled network usage , 2001, IMW '01.

[27]  Surajit Chaudhuri,et al.  Effective use of block-level sampling in statistics estimation , 2004, SIGMOD '04.

[28]  S. Muthukrishnan,et al.  Estimating Rarity and Similarity over Data Stream Windows , 2002, ESA.

[29]  Yossi Matias,et al.  Bifocal sampling for skew-resistant join size estimation , 1996, SIGMOD '96.

[30]  Paul Brown,et al.  BHUNT: Automatic Discovery of Fuzzy Algebraic Constraints in Relational Data , 2003, VLDB.

[31]  Peter J. Haas,et al.  Improved histograms for selectivity estimation of range predicates , 1996, SIGMOD '96.

[32]  Jeffrey F. Naughton,et al.  End-biased Samples for Join Cardinality Estimation , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[33]  Rajeev Motwani,et al.  Sampling from a moving window over streaming data , 2002, SODA '02.

[34]  Piotr Indyk,et al.  Sampling in dynamic data streams and applications , 2005, Int. J. Comput. Geom. Appl..

[35]  Peter J. Haas,et al.  Maintaining Bernoulli samples over evolving multisets , 2007, PODS '07.

[36]  David J. DeWitt,et al.  Parallel sorting on a shared-nothing architecture using probabilistic splitting , 1991, Proceedings of the First International Conference on Parallel and Distributed Information Systems.

[37]  Rajeev Motwani,et al.  Random sampling for histogram construction: how much is enough? , 1998, SIGMOD '98.

[38]  Wolfgang Lehner,et al.  Sampling time-based sliding windows in bounded space , 2008, SIGMOD Conference.

[39]  Peter J. Haas,et al.  A dip in the reservoir: maintaining sample synopses of evolving datasets , 2006, VLDB.

[40]  Wolfgang Lehner,et al.  Cardinality estimation using sample views with quality assurance , 2007, SIGMOD '07.

[41]  Aristides Gionis,et al.  Clustering aggregation , 2005, 21st International Conference on Data Engineering (ICDE'05).

[42]  Aristides Gionis,et al.  Assessing data mining results via swap randomization , 2007, TKDD.

[43]  Graham Cormode,et al.  Summarizing and Mining Inverse Distributions on Data Streams via Dynamic Inverse Sampling , 2005, VLDB.