Fast and Scalable Distributed Set Similarity Joins for Big Data Analytics
[1] Guoliang Li, et al. MassJoin: A mapreduce-based method for scalable string similarity joins, 2014, 2014 IEEE 30th International Conference on Data Engineering.
[2] Christos Faloutsos, et al. V-SMART-Join: A Scalable MapReduce Framework for All-Pair Similarity Joins of Multisets and Vectors, 2012, Proc. VLDB Endow.
[3] Sunita Sarawagi, et al. Efficient set joins on similarity predicates, 2004, SIGMOD '04.
[4] Michael McGill, et al. Introduction to Modern Information Retrieval, 1983.
[5] William E. Winkler, et al. The State of Record Linkage and Current Research Problems, 1999.
[6] Chen Li, et al. Efficient parallel set-similarity joins using MapReduce, 2010, SIGMOD Conference.
[7] Surajit Chaudhuri, et al. A Primitive Operator for Similarity Joins in Data Cleaning, 2006, 22nd International Conference on Data Engineering (ICDE'06).
[8] Andreas Thor, et al. Load Balancing for MapReduce-based Entity Resolution, 2011, 2012 IEEE 28th International Conference on Data Engineering.
[9] Guoliang Li, et al. PASS-JOIN: A Partition-based Method for Similarity Joins, 2011, Proc. VLDB Endow.
[10] Raghav Kaushik, et al. Efficient exact set-similarity joins, 2006, VLDB.
[11] William R. Hersh, et al. Managing Gigabytes: Compressing and Indexing Documents and Images (Second Edition), 2001, Information Retrieval.
[12] Salvatore J. Stolfo, et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem, 1998, Data Mining and Knowledge Discovery.
[13] Guoliang Li, et al. Can we beat the prefix filtering?: an adaptive framework for similarity join and search, 2012, SIGMOD Conference.
[14] Luis Gravano, et al. Approximate String Joins in a Database (Almost) for Free, 2001, VLDB.
[15] Roberto J. Bayardo, et al. Scaling up all pairs similarity search, 2007, WWW '07.
[16] Anthony K. H. Tung, et al. Efficient and Scalable Processing of String Similarity Join, 2013, IEEE Transactions on Knowledge and Data Engineering.
[17] Charles Elkan, et al. The Field Matching Problem: Algorithms and Applications, 1996, KDD.
[18] Jian Li, et al. Efficient Similarity Join and Search on Multi-Attribute Data, 2015, SIGMOD Conference.
[19] Salvatore J. Stolfo, et al. The merge/purge problem for large databases, 1995, SIGMOD '95.
[20] Jiaheng Lu, et al. String similarity measures and joins with synonyms, 2013, SIGMOD '13.
[21] Ian H. Witten, et al. Managing Gigabytes: Compressing and Indexing Documents and Images, 1999.
[22] Jiaheng Lu, et al. Efficient Merging and Filtering Algorithms for Approximate String Searches, 2008, 2008 IEEE 24th International Conference on Data Engineering.
[23] Jeffrey Xu Yu, et al. Efficient similarity joins for near-duplicate detection, 2011, TODS.