Adaptive Top-k Overlap Set Similarity Joins
[1] Xuemin Lin,et al. Top-k Set Similarity Joins , 2009, 2009 IEEE 25th International Conference on Data Engineering.
[2] Christoph Quix,et al. Data Lake , 2019, Encyclopedia of Big Data Technologies.
[3] Guoliang Li,et al. String similarity search and join: a survey , 2016, Frontiers of Computer Science.
[4] Ophir Frieder,et al. Collection statistics for fast duplicate document detection , 2002, TOIS.
[5] Lijun Chang,et al. Leveraging Set Relations in Exact Set Similarity Join , 2017, Proc. VLDB Endow.
[6] Renée J. Miller,et al. LSH Ensemble: Internet-Scale Domain Search , 2016, Proc. VLDB Endow.
[7] Rasmus Pagh,et al. Set similarity search beyond MinHash , 2017, STOC.
[8] Nikos Mamoulis,et al. Spatio-textual similarity joins , 2012, Proc. VLDB Endow.
[9] Chen Li,et al. Efficient parallel set-similarity joins using MapReduce , 2010, SIGMOD Conference.
[10] Jukka Riekki,et al. Implementing Big Data Lake for Heterogeneous Data Sources , 2019, 2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW).
[11] Guoliang Li,et al. An Efficient Partition Based Method for Exact Set Similarity Joins , 2015, Proc. VLDB Endow.
[12] Ling Shao,et al. LCJoin: Set Containment Join via List Crosscutting , 2019, 2019 IEEE 35th International Conference on Data Engineering (ICDE).
[13] Ying Zhang,et al. An Efficient Framework for Exact Set Similarity Search Using Tree Structure Indexes , 2017, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).
[14] Ping Li,et al. Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment , 2015, WWW.
[15] Nikolaus Augsten,et al. An Empirical Evaluation of Set Similarity Join Techniques , 2016, Proc. VLDB Endow.
[16] Renée J. Miller,et al. Data Lake Management: Challenges and Opportunities , 2019, Proc. VLDB Endow.
[17] Rasmus Pagh,et al. Set Similarity Search for Skewed Data , 2018, PODS.
[18] Raghav Kaushik,et al. Efficient exact set-similarity joins , 2006, VLDB.
[19] Piotr Indyk,et al. Similarity Search in High Dimensions via Hashing , 1999, VLDB.
[20] Xuemin Lin,et al. TT-Join: Efficient Set Containment Join , 2017, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).
[21] Roberto J. Bayardo,et al. Scaling up all pairs similarity search , 2007, WWW '07.
[22] Yufei Tao,et al. Overlap Set Similarity Joins with Theoretical Guarantees , 2018, SIGMOD Conference.
[23] Hector Garcia-Molina,et al. Adaptive algorithms for set containment joins , 2003, TODS.
[24] Rasmus Pagh. Locality-sensitive Hashing without False Negatives , 2016, SODA.
[25] William W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity , 1998, SIGMOD '98.
[26] Jeffrey Xu Yu,et al. Efficient similarity joins for near-duplicate detection , 2011, TODS.
[27] Surajit Chaudhuri,et al. A Primitive Operator for Similarity Joins in Data Cleaning , 2006, 22nd International Conference on Data Engineering (ICDE'06).
[28] Guoliang Li,et al. Can we beat the prefix filtering?: an adaptive framework for similarity join and search , 2012, SIGMOD Conference.
[29] Geoffrey Zweig,et al. Syntactic Clustering of the Web , 1997, Comput. Networks.
[30] Gang Chen,et al. Metric Similarity Joins Using MapReduce , 2017, IEEE Transactions on Knowledge and Data Engineering.
[31] Renée J. Miller,et al. JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes , 2019, SIGMOD Conference.