An unsupervised blocking technique for more efficient record linkage
Kevin O'Hare | Anna Jurek-Loughrey | Cassio de Campos
[1] Hector Garcia-Molina,et al. Pay-As-You-Go Entity Resolution , 2013, IEEE Transactions on Knowledge and Data Engineering.
[2] Peter Christen,et al. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication , 2012, IEEE Transactions on Knowledge and Data Engineering.
[3] George Papastefanatos,et al. Scaling Entity Resolution to Large, Heterogeneous Data with Enhanced Meta-blocking , 2016, EDBT.
[4] Daniel P. Miranker,et al. On Linking Heterogeneous Dataset Collections , 2014, SEMWEB.
[5] Craig A. Knoblock,et al. Learning Blocking Schemes for Record Linkage , 2006, AAAI.
[6] Anna Jurek-Loughrey,et al. A Review of Unsupervised and Semi-supervised Blocking Methods for Record Linkage , 2018, Unsupervised and Semi-Supervised Learning.
[7] Peter Christen,et al. Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface , 2008, KDD.
[8] Claudia Niederée,et al. A Blocking Framework for Entity Resolution in Highly Heterogeneous Information Spaces , 2013, IEEE Transactions on Knowledge and Data Engineering.
[9] Felix Naumann,et al. Progressive Duplicate Detection , 2015, IEEE Transactions on Knowledge and Data Engineering.
[10] Marcos André Gonçalves,et al. BLOSS: Effective meta-blocking with almost no effort , 2018, Inf. Syst..
[11] Andreas Thor,et al. Learning-Based Approaches for Matching Web Data Entities , 2010, IEEE Internet Computing.
[12] Vassilios S. Verykios,et al. Scalable Blocking for Privacy Preserving Record Linkage , 2015, KDD.
[13] Peter Fankhauser,et al. Efficient entity resolution for large heterogeneous information spaces , 2011, WSDM '11.
[14] Erhard Rahm,et al. Frameworks for entity matching: A comparison , 2010, Data Knowl. Eng..
[15] David G. Stork,et al. Pattern Classification , 1973 .
[16] Jeff Heflin,et al. Automatically Generating Data Linkages Using a Domain-Independent Candidate Selection Approach , 2011, SEMWEB.
[17] George Papastefanatos,et al. Supervised Meta-blocking , 2014, Proc. VLDB Endow..
[18] Anil K. Jain,et al. Algorithms for Clustering Data , 1988 .
[19] Kaizhong Zhang,et al. Evaluating a class of distance-mapping algorithms for data mining and clustering , 1999, KDD '99.
[20] Felix Naumann,et al. Schema matching using duplicates , 2005, 21st International Conference on Data Engineering (ICDE'05).
[21] Peter Christen,et al. Data Matching , 2012, Data-Centric Systems and Applications.
[22] Salvatore J. Stolfo,et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem , 1998, Data Mining and Knowledge Discovery.
[23] Vassilios S. Verykios,et al. An LSH-Based Blocking Approach with a Homomorphic Matching Technique for Privacy-Preserving Record Linkage , 2015, IEEE Transactions on Knowledge and Data Engineering.
[24] Ahmed K. Elmagarmid,et al. TAILOR: a record linkage toolbox , 2002, Proceedings 18th International Conference on Data Engineering.
[25] Panagiotis G. Ipeirotis,et al. Duplicate Record Detection: A Survey , 2007 .
[26] Raymond J. Mooney,et al. Adaptive Blocking: Learning to Scale Up Record Linkage , 2006, Sixth International Conference on Data Mining (ICDM'06).
[27] Anna Jurek,et al. A new technique of selecting an optimal blocking method for better record linkage , 2018, Inf. Syst..
[28] Piotr Indyk,et al. Similarity Search in High Dimensions via Hashing , 1999, VLDB.
[29] Chen Li,et al. Efficient record linkage in large data sets , 2003, Eighth International Conference on Database Systems for Advanced Applications, 2003. (DASFAA 2003). Proceedings..
[30] Sonia Bergamaschi,et al. BLAST: a Loosely Schema-aware Meta-blocking Approach for Entity Resolution , 2016, Proc. VLDB Endow..
[31] Avigdor Gal,et al. Comparative Analysis of Approximate Blocking Techniques for Entity Resolution , 2016, Proc. VLDB Endow..
[32] P Deepak,et al. Semi-supervised and Unsupervised Approaches to Record Pairs Classification in Multi-Source Data Linkage , 2018, Unsupervised and Semi-Supervised Learning.
[33] Wolfgang Nejdl,et al. Meta-Blocking: Taking Entity Resolution to the Next Level , 2014, IEEE Transactions on Knowledge and Data Engineering.
[34] Christos Faloutsos,et al. FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets , 1995, SIGMOD '95.
[35] Anand Rajaraman,et al. Mining of Massive Datasets , 2011 .
[36] Raymond K. Wong,et al. Unsupervised Blocking of Imbalanced Datasets for Record Matching , 2016, WISE.
[37] Huizhi Liang,et al. Semantic-Aware Blocking for Entity Resolution , 2016, IEEE Trans. Knowl. Data Eng..
[38] R. Mooney,et al. Learnable similarity functions and their application to record linkage and clustering , 2006 .
[39] George Papastefanatos,et al. Boosting the Efficiency of Large-Scale Entity Resolution with Enhanced Meta-Blocking , 2016, Big Data Res..
[40] Daniel P. Miranker,et al. Semi-supervised Instance Matching Using Boosted Classifiers , 2015, ESWC.
[41] Daniel P. Miranker,et al. A two-step blocking scheme learner for scalable link discovery , 2014, OM.
[42] Andreas Thor,et al. Evaluation of entity resolution approaches on real-world match problems , 2010, Proc. VLDB Endow..
[43] George Papadakis,et al. Blocking for large-scale Entity Resolution: Challenges, algorithms, and practical examples , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).
[44] Daniel P. Miranker,et al. An Unsupervised Algorithm for Learning Blocking Schemes , 2013, 2013 IEEE 13th International Conference on Data Mining.
[45] Andrew McCallum,et al. Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.