Methods for evaluating and creating data quality
[1] William E. Winkler,et al. Set-Covering and Editing Discrete Data , 1998 .
[2] Matthew A. Jaro,et al. Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida , 1989 .
[3] H. B. Newcombe,et al. Automatic linkage of vital records. , 1959, Science.
[4] William E. Winkler,et al. The State of Record Linkage and Current Research Problems , 1999 .
[5] William E. Winkler. Editing Discrete Data , 1997 .
[6] Clement T. Yu,et al. Term Weighting in Information Retrieval Using the Term Precision Model , 1982, JACM.
[7] W. Winkler. Improved Decision Rules in the Fellegi-Sunter Model of Record Linkage , 1993 .
[8] William E. Winkler,et al. Advanced Methods For Record Linkage , 1994 .
[9] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[10] Erhard Rahm,et al. Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..
[11] Felix Naumann,et al. Object Identification Quality , 2003 .
[12] Ming‐Pi Mi. Handbook of record linkage: Methods for health and statistical studies, administration, and business, Howard B. Newcombe, Oxford, England: Oxford University Press, 1988, 210 pp, $40.00 , 1989 .
[13] Ivan P. Fellegi, Alan B. Sunter. A Theory for Record Linkage , 1969 .
[14] W. Winkler. Using the EM Algorithm for Weight Computation in the Fellegi-Sunter Model of Record Linkage , 2000 .
[15] Ahmed K. Elmagarmid,et al. TAILOR: a record linkage toolbox , 2002, Proceedings 18th International Conference on Data Engineering.
[16] Craig A. Knoblock,et al. Learning object identification rules for information integration , 2001, Inf. Syst..
[17] Surajit Chaudhuri,et al. Eliminating Fuzzy Duplicates in Data Warehouses , 2002, VLDB.
[18] Lise Getoor,et al. Learning Probabilistic Relational Models , 1999, IJCAI.
[19] D. Rubin,et al. Iterative Automated Record Linkage Using Mixture Models , 2001 .
[20] Chen Li,et al. Efficient record linkage in large data sets , 2003, Eighth International Conference on Database Systems for Advanced Applications (DASFAA 2003).
[21] William E. Winkler. Quality of Very Large Databases , 2001 .
[22] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[23] Craig A. Knoblock,et al. Learning domain-independent string transformation weights for high accuracy object identification , 2002, KDD.
[24] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[25] Bor-Chung Chen,et al. Set Covering Algorithms in Edit Generation , 1998 .
[26] Eric R. Ziegel,et al. Business survey methods , 1995 .
[27] Gonzalo Navarro,et al. A guided tour to approximate string matching , 2001, CSUR.
[28] Larry P. English. Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits , 1999 .
[29] Peter Christen,et al. Preparation of name and address data for record linkage using hidden Markov models , 2002, BMC Medical Informatics Decis. Mak..
[30] T. De Waal. A Fast and Simple Algorithm for Automatic Editing of Mixed Data , 2003 .
[31] Howard B. Newcombe,et al. Record linkage: making maximum use of the discriminating power of identifying information , 1962, CACM.
[32] William S. Cooper,et al. Foundations of Probabilistic and Utility-Theoretic Indexing , 1978, JACM.
[33] Antonio Zamora,et al. Automatic spelling correction in scientific and scholarly text , 1984, CACM.
[34] D. Rubin,et al. A method for calibrating false-match rates in record linkage , 1995 .
[35] William E. Winkler,et al. Methods for Record Linkage and Bayesian Networks , 2002 .
[36] P. Lahiri,et al. Regression Analysis With Linked Data , 2005 .
[37] William E. Winkler,et al. String Comparator Metrics and Enhanced Decision Rules in the Fellegi-Sunter Model of Record Linkage. , 1990 .
[38] William E. Winkler,et al. The Discrete Edit System , 1997 .
[39] William E. Yancey. Improving EM Algorithm Estimates for Record Linkage Parameters , 2002 .
[40] Erhard Rahm,et al. A survey of approaches to automatic schema matching , 2001, The VLDB Journal.
[41] Antonio Sassano,et al. Optimization Techniques for an Error Free Data Collecting , 2001 .
[42] Luca De Santis,et al. Automatic Record Matching in Cooperative Information Systems , 2002 .
[43] William W. Cohen,et al. Learning to match and cluster large high-dimensional data sets for data integration , 2002, KDD.
[44] John D. Lafferty,et al. Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..
[45] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[46] Howard B. Newcombe,et al. Handbook of record linkage: methods for health and statistical studies, administration, and business , 1988 .
[47] David Loshin. Enterprise knowledge management: the data quality approach , 2000 .
[48] R. Burkard,et al. Assignment and Matching Problems: Solution Methods with FORTRAN-Programs , 1980 .
[49] W. Winkler. Machine Learning, Information Retrieval, and Record Linkage , 2000 .
[50] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .
[51] Fritz Scheuren,et al. Regression Analysis of Data Files that Are Computer Matched , 1993 .
[52] Tiziana Catarci,et al. Managing Data Quality in Cooperative Information Systems , 2002, OTM.
[53] Avi Pfeffer,et al. Probabilistic Frame-Based Systems , 1998, AAAI/IAAI.
[54] G. McLachlan,et al. The EM algorithm and extensions , 1996 .
[55] Roberto Grossi,et al. The string B-tree: a new data structure for string search in external memory and its applications , 1999, JACM.
[56] Julius T. Tou,et al. Information Systems , 1973, GI Jahrestagung.
[57] Romina Fraboni,et al. Economic Commission for Europe. , 1982, POPIN bulletin.
[58] D. Holt,et al. A Systematic Approach to Automatic Edit and Imputation , 1976 .
[59] R. S. Garfinkel,et al. Optimal Imputation of Erroneous Data: Categorical Data, General Edits , 1986, Oper. Res..
[60] Thomas Redman,et al. Data quality for the information age , 1996 .
[61] Andrew McCallum,et al. Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.
[62] Patrick A. V. Hall,et al. Approximate String Matching , 1994, Encyclopedia of Algorithms.