Exploiting block co-occurrence to control block sizes for entity resolution
[1] Carlos Eduardo S. Pires,et al. Data Quality Monitoring of Cloud Databases Based on Data Quality SLAs, 2015, Big-Data Analytics and Cloud Computing.
[2] Pasi Fränti,et al. Balanced K-Means for Clustering , 2014, S+SSPR.
[3] Peter Christen,et al. Hashing-Based Distributed Multi-party Blocking for Privacy-Preserving Record Linkage , 2016, PAKDD.
[4] Carlos Eduardo S. Pires,et al. Adaptive sorted neighborhood blocking for entity matching with MapReduce , 2015, SAC.
[5] Raymond J. Mooney,et al. Adaptive Blocking: Learning to Scale Up Record Linkage , 2006, Sixth International Conference on Data Mining (ICDM'06).
[6] Sanjay Chawla,et al. Robust record linkage blocking using suffix arrays , 2009, CIKM.
[7] Peter Christen,et al. Clustering-Based Scalable Indexing for Multi-party Privacy-Preserving Record Linkage , 2015, PAKDD.
[8] Wolfgang Nejdl,et al. Meta-Blocking: Taking Entity Resolution to the Next Level , 2014, IEEE Transactions on Knowledge and Data Engineering.
[9] Shafiq R. Joty,et al. Distributed Representations of Tuples for Entity Resolution , 2018, Proc. VLDB Endow..
[10] Carlos Eduardo S. Pires,et al. Heuristic-based approaches for speeding up incremental record linkage , 2018, J. Syst. Softw..
[11] Shumeet Baluja,et al. LSH banding for large-scale retrieval with memory and recall constraints , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[12] Jordi Forné,et al. A modification of the k-means method for quasi-unsupervised learning , 2013, Knowl. Based Syst..
[13] Qing Wang,et al. A Clustering-Based Framework to Control Block Sizes for Entity Resolution , 2015, KDD.
[14] Renée J. Miller,et al. Framework for Evaluating Clustering Algorithms in Duplicate Detection , 2009, Proc. VLDB Endow..
[15] Carlo Batini,et al. Methodologies for data quality assessment and improvement , 2009, CSUR.
[16] Felix Naumann,et al. Progressive Duplicate Detection , 2015, IEEE Transactions on Knowledge and Data Engineering.
[17] Georgia Koutrika,et al. Entity resolution with iterative blocking , 2009, SIGMOD Conference.
[18] Peter Christen,et al. Data Matching , 2012, Data-Centric Systems and Applications.
[19] Peter Christen,et al. A taxonomy of privacy-preserving record linkage techniques , 2013, Inf. Syst..
[20] Craig A. Knoblock,et al. Learning Blocking Schemes for Record Linkage , 2006, AAAI.
[21] Shunzhi Zhu,et al. Data clustering with size constraints , 2010, Knowl. Based Syst..
[22] Hector Garcia-Molina,et al. Pay-As-You-Go Entity Resolution , 2013, IEEE Transactions on Knowledge and Data Engineering.
[23] Gianni Costa,et al. An incremental clustering scheme for data de-duplication , 2009, Data Mining and Knowledge Discovery.
[24] Vassilios S. Verykios,et al. Privacy preserving record linkage approaches , 2009, Int. J. Data Min. Model. Manag..
[25] Carlos Eduardo S. Pires,et al. Improving load balancing for MapReduce-based entity matching , 2013, 2013 IEEE Symposium on Computers and Communications (ISCC).
[26] Nikolaus Augsten,et al. An Empirical Evaluation of Set Similarity Join Techniques , 2016, Proc. VLDB Endow..
[27] Huizhi Liang,et al. Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution , 2015, ACM J. Data Inf. Qual..
[28] Shafiq R. Joty,et al. DeepER - Deep Entity Resolution , 2017, ArXiv.
[29] Divesh Srivastava,et al. Incremental Record Linkage , 2014, Proc. VLDB Endow..
[30] Divesh Srivastava,et al. Record linkage: similarity measures and algorithms , 2006, SIGMOD Conference.
[31] William W. Cohen,et al. Learning to match and cluster large high-dimensional data sets for data integration , 2002, KDD.
[32] Peter Christen,et al. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication , 2012, IEEE Transactions on Knowledge and Data Engineering.
[33] Peter Christen,et al. Sorted Nearest Neighborhood Clustering for Efficient Private Blocking , 2013, PAKDD.
[34] Andreas Thor,et al. Multi-pass sorted neighborhood blocking with MapReduce , 2012, Computer Science - Research and Development.
[35] C. K. Michael Tse,et al. Data Clustering with Cluster Size Constraints Using a Modified K-Means Algorithm , 2014, 2014 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery.
[36] C. Lee Giles,et al. Adaptive sorted neighborhood methods for efficient record linkage , 2007, JCDL '07.
[37] George Papastefanatos,et al. Supervised Meta-blocking , 2014, Proc. VLDB Endow..
[38] Carlos Eduardo S. Pires,et al. Towards the efficient parallelization of multi-pass adaptive blocking for entity matching , 2017, J. Parallel Distributed Comput..
[39] Carlo Batini,et al. Data Quality Dimensions, 2016.
[40] Christophe G. Giraud-Carrier,et al. Effective record linkage for mining campaign contribution data , 2014, Knowledge and Information Systems.