Efficient Duplicate Record Detection Based on Similarity Estimation
[1] William W. Cohen. Data integration using similarity joins and a word-based information representation language , 2000, TOIS.
[2] Ethem Alpaydin,et al. Introduction to machine learning , 2004, Adaptive computation and machine learning.
[3] Rajeev Motwani,et al. Robust and efficient fuzzy match for online data cleaning , 2003, SIGMOD '03.
[4] Peter N. Yianilos,et al. Learning String-Edit Distance , 1996, IEEE Trans. Pattern Anal. Mach. Intell.
[5] S. T. Buckland,et al. An Introduction to the Bootstrap , 1994 .
[6] William W. Cohen,et al. Semi-Markov Conditional Random Fields for Information Extraction , 2004, NIPS.
[7] Surajit Chaudhuri,et al. Transformation-based Framework for Record Matching , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[8] Raymond J. Mooney,et al. Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.
[9] James Munkres. On the assignment and transportation problems (abstract) , 1957 .
[10] Sunita Sarawagi,et al. Automatic segmentation of text into structured records , 2001, SIGMOD '01.
[11] William W. Cohen,et al. Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods , 2004, KDD.
[12] Harold W. Kuhn,et al. The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.
[13] Divesh Srivastava,et al. Benchmarking declarative approximate selection predicates , 2007, SIGMOD '07.
[14] Paul A. Viola,et al. Learning to extract information from semi-structured text using a discriminative context free grammar , 2005, SIGIR '05.
[15] J. Munkres. Algorithms for the Assignment and Transportation Problems , 1957 .
[16] Raghav Kaushik,et al. A grammar-based entity representation framework for data cleaning , 2009, SIGMOD Conference.
[17] Panagiotis G. Ipeirotis,et al. Duplicate Record Detection: A Survey , 2007 .