Large-Scale Similarity Join with Edit-Distance Constraints
Chen Lin | Xianmang He | Haiyang Yu | Wei Weng
[1] Surajit Chaudhuri,et al. Transformation-based Framework for Record Matching , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[2] Piotr Indyk,et al. Similarity Search in High Dimensions via Hashing , 1999, VLDB.
[3] Xuemin Lin,et al. Top-k Set Similarity Joins , 2009, 2009 IEEE 25th International Conference on Data Engineering.
[4] J. Bobadilla,et al. Recommender systems survey , 2013, Knowl. Based Syst.
[5] Jignesh M. Patel,et al. A comparison of join algorithms for log processing in MapReduce , 2010, SIGMOD Conference.
[6] Surajit Chaudhuri,et al. A Primitive Operator for Similarity Joins in Data Cleaning , 2006, 22nd International Conference on Data Engineering (ICDE'06).
[7] Luis Gravano,et al. Approximate String Joins in a Database (Almost) for Free , 2001, VLDB.
[8] Douglas Stott Parker,et al. Map-reduce-merge: simplified relational data processing on large clusters , 2007, SIGMOD '07.
[9] Jeffrey Xu Yu,et al. Efficient similarity joins for near-duplicate detection , 2011, TODS.
[10] Xuemin Lin,et al. Ed-Join: an efficient algorithm for similarity joins with edit distance constraints , 2008, Proc. VLDB Endow..
[11] Gerhard Weikum,et al. The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents , 2005, VLDB.
[12] Din J. Wasem,et al. Mining of Massive Datasets , 2014.
[13] Marco Thines,et al. Signatures of Adaptation to Obligate Biotrophy in the Hyaloperonospora arabidopsidis Genome , 2010, Science.
[14] Guoliang Li,et al. PASS-JOIN: A Partition-based Method for Similarity Joins , 2011, Proc. VLDB Endow..
[15] Setsuo Ohsuga,et al. International Conference on Very Large Data Bases , 1977.
[16] Raghav Kaushik,et al. Efficient exact set-similarity joins , 2006, VLDB.
[17] Beng Chin Ooi,et al. Proceedings of the 2007 ACM SIGMOD international conference on Management of data , 2007, SIGMOD 2007.
[18] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[19] Roberto J. Bayardo,et al. Scaling up all pairs similarity search , 2007, WWW '07.
[20] Surajit Chaudhuri,et al. An efficient filter for approximate membership checking , 2008, SIGMOD Conference.
[21] Jingren Zhou,et al. SCOPE: easy and efficient parallel processing of massive data sets , 2008, Proc. VLDB Endow..
[22] David J. DeWitt,et al. A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment , 1989, SIGMOD '89.
[23] Guoliang Li,et al. Efficient parallel partition-based algorithms for similarity search and join with edit distance constraints , 2013, EDBT '13.
[24] Chen Li,et al. Efficient parallel set-similarity joins using MapReduce , 2010, SIGMOD Conference.
[25] Christopher Olston,et al. Automatic Optimization of Parallel Dataflow Programs , 2008, USENIX Annual Technical Conference.