[1] Christopher J. C. Burges, et al. A Tutorial on Support Vector Machines for Pattern Recognition, 1998, Data Mining and Knowledge Discovery.
[2] Erhard Rahm, et al. Data Cleaning: Problems and Current Approaches, 2000, IEEE Data Eng. Bull.
[3] Din J. Wasem, et al. Mining of Massive Datasets, 2014.
[4] S. Sudarshan, et al. Keyword searching and browsing in databases using BANKS, 2002, Proceedings 18th International Conference on Data Engineering.
[5] Peter Christen, et al. Towards Scalable Real-Time Entity Resolution using a Similarity-Aware Inverted Index Approach, 2008, AusDM.
[6] Gautam Shroff, et al. Approximate Incremental Big-Data Harmonization, 2013, 2013 IEEE International Congress on Big Data.
[7] Peter Christen. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication, 2012.
[8] Yehoshua Sagiv, et al. Finding and approximating top-k answers in keyword proximity search, 2006, PODS '06.
[9] Hector Garcia-Molina, et al. Pay-As-You-Go Entity Resolution, 2013, IEEE Transactions on Knowledge and Data Engineering.
[10] Salvatore J. Stolfo, et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem, 1998, Data Mining and Knowledge Discovery.
[11] Raymond J. Mooney, et al. Adaptive duplicate detection using learnable string similarity measures, 2003, KDD '03.
[12] William E. Winkler, et al. The State of Record Linkage and Current Research Problems, 1999.
[13] Wolfgang Nejdl, et al. Efficient Incremental Near Duplicate Detection Based on Locality Sensitive Hashing, 2010, DEXA.
[14] Dongwon Lee, et al. HARRA: fast iterative hashed record linkage for large-scale data collections, 2010, EDBT '10.
[15] András A. Benczúr, et al. Infrastructures and bound for distributed entity resolution, 2011.
[16] Pradeep Ravikumar, et al. A Comparison of String Distance Metrics for Name-Matching Tasks, 2003, IIWeb.
[17] Lise Getoor, et al. Collective entity resolution in relational data, 2007, TKDD.
[18] Claudia Niederée, et al. Beyond 100 million entities: large-scale blocking-based resolution for heterogeneous data, 2012, WSDM '12.
[19] Lise Getoor, et al. Query-time entity resolution, 2006, KDD '06.
[20] Divesh Srivastava, et al. Record linkage with uniqueness constraints and erroneous values, 2010, Proc. VLDB Endow.
[21] Jennifer Widom, et al. Swoosh: a generic approach to entity resolution, 2008, The VLDB Journal.
[22] Panagiotis G. Ipeirotis, et al. Duplicate Record Detection: A Survey, 2007.
[23] Alan M. Frieze, et al. Min-Wise Independent Permutations, 2000, J. Comput. Syst. Sci.
[24] Alexandr Andoni, et al. Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[25] Erhard Rahm, et al. Generic Schema Matching with Cupid, 2001, VLDB.