A framework for entity resolution with efficient blocking
Clement T. Yu | Neil R. Smalheiser | Yue Han | Weiyi Meng | Liangcai Shu | Can Lin
[1] Weiyi Meng,et al. A Latent Topic Model for Complete Entity Resolution , 2009, 2009 IEEE 25th International Conference on Data Engineering.
[2] Raymond J. Mooney,et al. Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.
[3] Lise Getoor,et al. Iterative record linkage for cleaning and integration , 2004, DMKD '04.
[4] William W. Cohen,et al. Learning to match and cluster large high-dimensional data sets for data integration , 2002, KDD.
[5] Carlo Batini,et al. Data Quality: Concepts, Methodologies and Techniques , 2006, Data-Centric Systems and Applications.
[6] Ivan P. Fellegi,et al. A Theory for Record Linkage , 1969 .
[7] อนิรุธ สืบสิงห์,et al. Data Mining Practical Machine Learning Tools and Techniques , 2014 .
[8] Andrew McCallum,et al. Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.
[9] H. B. Newcombe,et al. Automatic linkage of vital records , 1959, Science.
[10] Ian H. Witten,et al. Data mining - practical machine learning tools and techniques, Second Edition , 2005, The Morgan Kaufmann series in data management systems.
[11] W. Winkler. Overview of Record Linkage and Current Research Directions , 2006 .
[12] Esko Ukkonen,et al. Approximate String Matching with q-grams and Maximal Matches , 1992, Theor. Comput. Sci.
[13] Vladimir I. Levenshtein,et al. Binary codes capable of correcting deletions, insertions, and reversals , 1965 .
[14] M. E. J. Newman,et al. Finding and evaluating community structure in networks , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.
[15] Ahmed K. Elmagarmid,et al. Duplicate Record Detection: A Survey , 2007, IEEE Transactions on Knowledge and Data Engineering.
[16] Hinrich Schütze,et al. Introduction to information retrieval , 2008 .
[17] Lise Getoor,et al. Deduplication and Group Detection using Links , 2004 .
[18] Peter Christen,et al. Febrl - Freely extensible biomedical record linkage , 2002 .
[19] Gerard Salton,et al. A vector space model for automatic indexing , 1975, CACM.
[20] Salvatore J. Stolfo,et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem , 1998, Data Mining and Knowledge Discovery.
[21] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[22] Luis Gravano,et al. Text joins in an RDBMS for web data integration , 2003, WWW '03.
[23] Renée J. Miller,et al. Framework for Evaluating Clustering Algorithms in Duplicate Detection , 2009, Proc. VLDB Endow..
[24] Don X. Sun,et al. Methods for Linking and Mining Massive Heterogeneous Databases , 1998, KDD.
[25] Jon Williamson,et al. Bayesian Nets and Causality: Philosophical and Computational Foundations , 2005 .
[26] Ahmed K. Elmagarmid,et al. Automating the approximate record-matching process , 2000, Inf. Sci..
[27] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[28] Carlo Batini,et al. Data Quality: Concepts, Methodologies and Techniques (Data-Centric Systems and Applications) , 2006 .
[29] Stuart J. Russell,et al. Identity Uncertainty and Citation Matching , 2002, NIPS.
[30] Jianzhong Li,et al. Reasoning about Record Matching Rules , 2009, Proc. VLDB Endow..
[31] Georgia Koutrika,et al. Entity resolution with iterative blocking , 2009, SIGMOD Conference.
[32] Lise Getoor,et al. A Latent Dirichlet Model for Unsupervised Entity Resolution , 2005, SDM.
[33] Sudipto Guha,et al. Merging the Results of Approximate Match Operations , 2004, VLDB.
[34] Weiyi Meng,et al. Efficient SPectrAl Neighborhood blocking for entity resolution , 2011, 2011 IEEE 27th International Conference on Data Engineering.