Determining a Set of Edits

This paper covers methods for creating a set of edits for a data set. Edits are logical constraints designed to detect errors in the data, such as a five-year-old child being recorded as married, or the wages of one employee being five times as high as the wages of an employee in a similar position. Edits are intended to improve the quality of data. Once the data are edited, with erroneous fields identified and imputed, they can be used for statistical analyses or business purposes. If errors remain in the data, analysts and policy makers who use the data may make inappropriate decisions.
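To make the notion of an edit concrete, the two examples above can be sketched as predicates over a record. This is only an illustration, not the paper's method: the field names (`age`, `marital_status`, `wage`, `peer_wage`), the age cutoff, and the ratio threshold are assumptions chosen to mirror the examples in the text.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
# An edit is modeled here as a function that returns True when the
# record FAILS the constraint (i.e., the record looks erroneous).
Edit = Callable[[Record], bool]


def child_married(rec: Record) -> bool:
    # Hypothetical edit: a child aged five or younger recorded as married.
    return rec["age"] <= 5 and rec["marital_status"] == "married"


def wage_ratio_too_high(rec: Record, max_ratio: float = 5.0) -> bool:
    # Hypothetical edit: wage is at least max_ratio times the wage of an
    # employee in a similar position (stored in the assumed field peer_wage).
    return rec["wage"] >= max_ratio * rec["peer_wage"]


# The set of edits for this toy data set, keyed by an illustrative name.
EDITS: Dict[str, Edit] = {
    "child_married": child_married,
    "wage_ratio": wage_ratio_too_high,
}


def failed_edits(rec: Record) -> List[str]:
    # Return the names of all edits that the record fails.
    return [name for name, edit in EDITS.items() if edit(rec)]


if __name__ == "__main__":
    record = {"age": 5, "marital_status": "married",
              "wage": 0.0, "peer_wage": 1.0}
    print(failed_edits(record))
```

A record that fails no edit passes through unchanged; a record that fails one or more edits would be a candidate for identifying and imputing the erroneous fields.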
