Improving database quality through eliminating duplicate records
[1] Maxime Crochemore, et al. Two-way string-matching, 1991, JACM.
[2] Ricardo A. Baeza-Yates, et al. Matchsimile: a Flexible Approximate Matching Tool for Searching Proper Name, 2003, J. Assoc. Inf. Sci. Technol.
[3] Erhard Rahm, et al. Data Cleaning: Problems and Current Approaches, 2000, IEEE Data Eng. Bull.
[4] Karen Kukich, et al. Spelling correction for the telecommunications network for the deaf, 1992, CACM.
[5] Karen Kukich, et al. Techniques for automatically correcting words in text, 1992, CSUR.
[6] Surajit Chaudhuri, et al. Eliminating Fuzzy Duplicates in Data Warehouses, 2002, VLDB.
[7] Luis Gravano, et al. Approximate String Joins in a Database (Almost) for Free, 2001, VLDB.
[8] Hongjun Lu, et al. Cleansing Data for Mining and Warehousing, 1999, DEXA.
[9] C. Lee Giles, et al. Two supervised learning approaches for name disambiguation in author citations, 2004, Proceedings of the 2004 Joint ACM/IEEE Conference on Digital Libraries.
[10] Esko Ukkonen, et al. Approximate String Matching with q-grams and Maximal Matches, 1992, Theor. Comput. Sci.
[11] Charles Elkan, et al. An Efficient Domain-Independent Algorithm for Detecting Approximately Duplicate Database Records, 1997, DMKD.
[12] Charles Elkan, et al. The Field Matching Problem: Algorithms and Applications, 1996, KDD.
[13] M. S. Waterman, et al. Identification of common molecular subsequences, 1981, Journal of Molecular Biology.
[14] Matthew A. Jaro, et al. Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida, 1989.
[15] William E. Winkler, et al. String Comparator Metrics and Enhanced Decision Rules in the Fellegi-Sunter Model of Record Linkage, 1990.
[16] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.
[17] Rajeev Motwani, et al. Robust and efficient fuzzy match for online data cleaning, 2003, SIGMOD '03.
[18] Theodore Johnson, et al. Exploratory Data Mining and Data Cleaning, 2003.
[19] Gonzalo Navarro, et al. A guided tour to approximate string matching, 2001, CSUR.
[20] David Sankoff, et al. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, 1983.
[21] Salvatore J. Stolfo, et al. The merge/purge problem for large databases, 1995, SIGMOD '95.