A Progressive Method for Detecting Duplication Entities Based on Bloom Filters
Derong Shen | Tiezheng Nie | Ge Yu | Yue Kou | Yebing Luo
[1] Anil K. Jain, et al. Data clustering: a review, 1999, CSUR.
[2] Piotr Indyk, et al. Similarity Search in High Dimensions via Hashing, 1999, VLDB.
[3] Jeffrey Xu Yu, et al. Efficient similarity joins for near-duplicate detection, 2011, TODS.
[4] Mayank Bawa, et al. LSH forest: self-tuning indexes for similarity search, 2005, WWW '05.
[5] Claudia Niederée, et al. A Blocking Framework for Entity Resolution in Highly Heterogeneous Information Spaces, 2013, IEEE Transactions on Knowledge and Data Engineering.
[6] Peter Christen, et al. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication, 2012, IEEE Transactions on Knowledge and Data Engineering.
[7] Ahmed K. Elmagarmid, et al. Duplicate Record Detection: A Survey, 2007, IEEE Transactions on Knowledge and Data Engineering.
[8] Howard B. Newcombe, et al. Record linkage: making maximum use of the discriminating power of identifying information, 1962, CACM.
[9] Jeffrey Xu Yu, et al. Efficient similarity joins for near duplicate detection, 2008, WWW.
[10] Hyesook Lim, et al. Cache sharing using bloom filters in named data networking, 2017, J. Netw. Comput. Appl.
[11] Andrew McCallum, et al. Efficient clustering of high-dimensional data sets with application to reference matching, 2000, KDD '00.
[12] Salvatore J. Stolfo, et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem, 1998, Data Mining and Knowledge Discovery.
[13] Salvatore J. Stolfo, et al. The merge/purge problem for large databases, 1995, SIGMOD '95.
[14] Hector Garcia-Molina, et al. Pay-As-You-Go Entity Resolution, 2013, IEEE Transactions on Knowledge and Data Engineering.
[15] Zhe Wang, et al. Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search, 2007, VLDB.