Power-Law Based Estimation of Set Similarity Join Size
[1] George Kingsley Zipf,et al. Human behavior and the principle of least effort , 1949 .
[2] Theodore Johnson,et al. Mining database structure; or, how to build a data quality browser , 2002, SIGMOD '02.
[3] Sunita Sarawagi,et al. Efficient set joins on similarity predicates , 2004, SIGMOD '04.
[4] Henrik Grosskreutz,et al. A Randomized Approach for Approximating the Number of Frequent Sets , 2008, 2008 Eighth IEEE International Conference on Data Mining.
[5] Christos Faloutsos,et al. Spatial join selectivity using power laws , 2000, SIGMOD '00.
[6] Ramakrishnan Srikant,et al. Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.
[7] Mehran Sahami,et al. A web-based kernel function for measuring the similarity of short text snippets , 2006, WWW '06.
[8] Mark E. J. Newman,et al. Power-Law Distributions in Empirical Data , 2007, SIAM Rev.
[9] Raghav Kaushik,et al. Efficient exact set-similarity joins , 2006, VLDB.
[10] Xiaohui Yu,et al. Hashed samples: selectivity estimators for set similarity selection queries , 2008, Proc. VLDB Endow.
[11] Edith Cohen,et al. Size-Estimation Framework with Applications to Transitive Closure and Reachability , 1997, J. Comput. Syst. Sci.
[12] Roberto J. Bayardo,et al. Scaling up all pairs similarity search , 2007, WWW '07.
[13] Hussein H. Aly,et al. Mining association rules , 2001, CATA.
[14] Kyuseok Shim,et al. Approximate substring selectivity estimation , 2009, EDBT '09.
[15] S. Muthukrishnan,et al. Selectivity estimation for Boolean queries , 2000, PODS '00.
[16] Ming-Syan Chen,et al. Power-law relationship and self-similarity in the itemset support distribution: analysis and applications , 2008, The VLDB Journal.
[17] Divyakant Agrawal,et al. Detectives: detecting coalition hit inflation attacks in advertising networks streams , 2007, WWW '07.
[18] Dong Wang,et al. Estimating the number of frequent itemsets in a large database , 2009, EDBT '09.
[19] Surajit Chaudhuri,et al. A Primitive Operator for Similarity Joins in Data Cleaning , 2006, 22nd International Conference on Data Engineering (ICDE'06).
[20] Divesh Srivastava,et al. Fast Indexes and Algorithms for Set Similarity Selection Queries , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[21] Renée J. Miller,et al. ConQuer: efficient management of inconsistent databases , 2005, SIGMOD '05.
[22] Theoni Pitoura,et al. Self-Join Size Estimation in Large-scale Distributed Data Systems , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[23] Salvatore J. Stolfo,et al. The merge/purge problem for large databases , 1995, SIGMOD '95.
[24] Andrei Z. Broder,et al. On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).
[25] Kyuseok Shim,et al. Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance , 2007, VLDB.
[26] John F. Roddick,et al. Association mining , 2006, CSUR.
[27] Charles L. Lawson,et al. Solving least squares problems , 1976, Classics in applied mathematics.