Bagging, bumping, multiview, and active learning for record linkage with empirical results on patient identity data
[1] Carlos Soares,et al. Is the UCI Repository Useful for Data Mining? , 2003, EPIA.
[2] J. Darroch,et al. Generalized Iterative Scaling for Log-Linear Models , 1972 .
[3] Salvatore J. Stolfo,et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem , 1998, Data Mining and Knowledge Discovery.
[4] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[5] Murat Sariyar,et al. Controlling false match rates in record linkage using extreme value theory , 2011, J. Biomed. Informatics.
[6] Xiaojin Zhu,et al. Introduction to Semi-Supervised Learning , 2009, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[7] ipred : Improved Predictors , 2009 .
[8] Melba M. Crawford,et al. View Generation for Multiview Maximum Disagreement Based Active Learning for Hyperspectral Image Classification , 2012, IEEE Transactions on Geoscience and Remote Sensing.
[9] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[10] Fabio Roli,et al. Using Co-training and Self-training in Semi-supervised Multiple Classifier Systems , 2006, SSPR/SPR.
[11] Matthias Egger,et al. The Swiss National Cohort: a unique database for national and international researchers , 2010, International Journal of Public Health.
[12] A Wajda,et al. The art and science of record linkage: methods that work with few identifiers. , 1986, Computers in biology and medicine.
[13] A. Campbell,et al. Progress in Artificial Intelligence , 1995, Lecture Notes in Computer Science.
[14] Wilfred Ng,et al. Applying Co-training to Clickthrough Data for Search Engine Adaptation , 2004, DASFAA.
[15] Friedhelm Schwenker,et al. 2010 Special Issue , 2010 .
[16] Lifang Gu,et al. Decision Models for Record Linkage , 2006, Selected Papers from AusDM.
[17] William E. Winkler,et al. Advanced Methods For Record Linkage , 1994 .
[18] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[19] Xiaojin Zhu,et al. --1 CONTENTS , 2006 .
[20] Ion Muslea,et al. Active Learning with Multiple Views , 2009, Encyclopedia of Data Warehousing and Mining.
[21] George V. Moustakides,et al. A Bayesian decision model for cost optimal record matching , 2003, The VLDB Journal.
[22] William E. Winkler,et al. Data quality and record linkage techniques , 2007 .
[23] Graham J. Williams,et al. Data Mining - Theory, Methodology, Techniques, and Applications , 2006, Lecture Notes in Computer Science.
[24] Ran El-Yaniv,et al. Large margin vs. large volume in transductive learning , 2008, Machine Learning.
[25] Chao Deng,et al. A new co-training-style random forest for computer aided diagnosis , 2011, Journal of Intelligent Information Systems.
[26] Mikhail F. Kanevski,et al. A Survey of Active Learning Algorithms for Supervised Remote Sensing Image Classification , 2011, IEEE Journal of Selected Topics in Signal Processing.
[27] William E. Yancey. Evaluating String Comparator Performance for Record Linkage , 2005 .
[28] Christopher M. Bishop,et al. Classification and regression , 1997 .
[29] Dennis Shasha,et al. Efficient data reconciliation , 2001, Inf. Sci..
[30] William W. Cohen,et al. Learning to match and cluster large high-dimensional data sets for data integration , 2002, KDD.
[31] Ran El-Yaniv,et al. Large Margin vs. Large Volume in Transductive Learning , 2008, ECML/PKDD.
[32] Ahmed K. Elmagarmid,et al. Automating the approximate record-matching process , 2000, Inf. Sci..
[33] Ling Qiu,et al. Preserving privacy in association rule mining with bloom filters , 2006, Journal of Intelligent Information Systems.
[34] U. M. Fayyad. Data mining and knowledge discovery: making sense out of data , 1996 .
[35] Günther Palm,et al. Semi-supervised learning for tree-structured ensembles of RBF networks with Co-Training , 2010, Neural Networks.
[36] Murat Sariyar,et al. Evaluation of Record Linkage Methods for Iterative Insertions , 2009, Methods of Information in Medicine.
[37] Ivan P. Fellegi,et al. A Theory for Record Linkage , 1969 .
[38] Michael Haber. Fitting a General Log‐Linear Model , 1984 .
[39] G. Niklas Norén,et al. Duplicate detection in adverse drug reaction surveillance , 2007, Data Mining and Knowledge Discovery.
[40] Murat Sariyar,et al. Missing values in deduplication of electronic patient data , 2012, J. Am. Medical Informatics Assoc..
[41] R. Tibshirani,et al. Model Search by Bootstrap “Bumping” , 1999 .
[42] Zehra Cataltepe,et al. Co-training with relevant random subspaces , 2010, Neurocomputing.
[43] Jianyi Guo,et al. Question classification based on co-training style semi-supervised learning , 2010, Pattern Recognit. Lett..
[44] Ahmed K. Elmagarmid,et al. Duplicate Record Detection: A Survey , 2007, IEEE Transactions on Knowledge and Data Engineering.
[45] N. Ohashi,et al. Agreement , 2002 .
[46] George Hripcsak,et al. Technical Brief: Agreement, the F-Measure, and Reliability in Information Retrieval , 2005, J. Am. Medical Informatics Assoc..
[47] Leo Breiman,et al. Classification and Regression Trees , 1984 .
[48] P. Bühlmann,et al. Analyzing Bagging , 2001 .
[49] Vladimir Vapnik,et al. Statistical learning theory , 1998 .