New Methods of Editing and Imputation

Editing of data collected for the preparation of statistics is a time- and resource-consuming process. This paper presents experiments with artificial neural networks as a potential tool for increasing the effectiveness of statistical editing and imputation. To maintain accuracy in the resulting statistics, the possibility of deriving reliable accuracy predictions is also discussed.

Producers of statistics have always been concerned about the quality of their statistics. A hundred years ago, the term data editors was used for the staff responsible for inspecting and controlling the data collected from respondents. The appearance of programmed electronic computers 50 years ago offered opportunities to automate the editing. Already in the early 1960's, for example, computerized editing of the US Agriculture Census data was used extensively, yet it is estimated that 20-40 % of the total cost of a survey or a census is still spent on editing (Granquist 1997). Looking for new ways to reduce the resources required for editing is therefore a continuing challenge.

The purpose of this paper is to discuss an approach to statistical editing whose main idea is to use the available information about the statistical units more effectively in control and imputation. This is not a new idea. It was discussed at international meetings as functional editing more than 30 years ago (Nordbotten 1963). Because of more efficient technical tools and more available information, functional editing is far more realistic today than at the time it was first discussed. This paper will discuss the editing and imputation process and some aspects of neural network methods, and conclude with illustrations based on experiments with the functional approach carried out on real-life data by means of artificial neural networks. Examples of similar experiments have also been discussed by others (Roddick 1993, Teague and Thomas 1997).
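
As a minimal sketch of the functional idea, a feed-forward network can be fitted to records that have passed the edit controls and then used to impute a missing variable of a unit from its other known variables. The network configuration, variable names, and synthetic data below are illustrative assumptions only, not the experiments reported later in this paper.

# Illustrative sketch: neural-network imputation of a missing variable
# from other known variables of a unit (hypothetical, synthetic data).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic "accepted" records: two predictor variables and a target
# variable y (e.g. auxiliary information and a survey variable to impute).
X_clean = rng.normal(size=(500, 2))
y_clean = 3.0 * X_clean[:, 0] - 2.0 * X_clean[:, 1] + rng.normal(scale=0.1, size=500)

# Fit a small feed-forward network on the records that passed editing.
scaler = StandardScaler().fit(X_clean)
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(scaler.transform(X_clean), y_clean)

# A record with the target variable missing: predict (impute) it from
# the unit's other known variables.
x_incomplete = np.array([[0.5, -1.2]])
y_imputed = model.predict(scaler.transform(x_incomplete))
print("Imputed value:", y_imputed[0])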