Probabilistic Data Generation for Deduplication and Data Linkage
[1] Pradeep Ravikumar, et al. A Hierarchical Graphical Model for Record Linkage, 2004, UAI.
[2] P. Ivax, et al. A Theory for Record Linkage, 2004.
[3] Lifang Gu, et al. Adaptive Filtering for Efficient Record Linkage, 2004, SDM.
[4] Un Yong Nahm, Mikhail Bilenko, Raymond J. Mooney. Two Approaches to Handling Noisy Variation in Text Mining, 2002.
[5] Peter Christen, et al. Febrl - A Parallel Open Source Data Linkage System: http://datamining.anu.edu.au/linkage.html, 2004, PAKDD.
[6] Karen Kukich, et al. Techniques for automatically correcting words in text, 1992, CSUR.
[7] Ahmed K. Elmagarmid, et al. TAILOR: a record linkage toolbox, 2002, Proceedings 18th International Conference on Data Engineering.
[8] Salvatore J. Stolfo, et al. The merge/purge problem for large databases, 1995, SIGMOD '95.
[9] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.
[10] Raymond J. Mooney, et al. Adaptive duplicate detection using learnable string similarity measures, 2003, KDD '03.
[11] Andrew McCallum, et al. Efficient clustering of high-dimensional data sets with application to reference matching, 2000, KDD '00.
[12] Patrick A. V. Hall, et al. Approximate String Matching, 1994, Encyclopedia of Algorithms.
[13] Rajeev Motwani, et al. Robust identification of fuzzy duplicates, 2005, 21st International Conference on Data Engineering (ICDE'05).
[14] Craig A. Knoblock, et al. Learning domain-independent string transformation weights for high accuracy object identification, 2002, KDD.
[15] William E. Yancey. An Adaptive String Comparator for Record Linkage, 2004.
[16] Anuradha Bhamidipaty, et al. Interactive deduplication using active learning, 2002, KDD.
[17] Lyle H. Ungar, et al. String Edit Analysis for Merging Databases, 2000, KDD 2000.
[18] Antonio Zamora, et al. Automatic spelling correction in scientific and scholarly text, 1984, CACM.
[19] Fred J. Damerau, et al. A technique for computer detection and correction of spelling errors, 1964, CACM.
[20] Mikhail Bilenko, Raymond J. Mooney. On Evaluation and Training-Set Construction for Duplicate Detection, 2003.
[21] Pradeep Ravikumar, et al. A Comparison of String Distance Metrics for Name-Matching Tasks, 2003, IIWeb.
[22] Michael Giffin, et al. New South Wales mothers and babies 2001, 2002, New South Wales public health bulletin.
[23] William W. Cohen, et al. Learning to match and cluster large high-dimensional data sets for data integration, 2002, KDD.