To meet the requirements for building data centers in State Grid project planning, data cleaning during data extraction is divided into two sub-processes: abnormal values detected in the electric quantity data are first set to NULL, and these missing values are then predicted from the remaining valid values. To further improve data quality, we propose a method based on a genetic neural network for handling the missing values. The method combines the global search ability of the genetic algorithm with the nonlinear mapping ability of the neural network, which improves the prediction accuracy of the data. Experiments show that the method is feasible and effective in improving prediction precision.
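The two sub-processes can be illustrated with a minimal sketch. The abstract does not specify the detection rule, the network structure, or the genetic operators, so everything below is an assumption made for illustration: abnormal values are flagged with a median-absolute-deviation rule, the network has one tanh hidden layer, and the genetic algorithm (elitist selection, uniform crossover, Gaussian mutation) searches the weights directly rather than seeding a back-propagation phase. The names `flag_abnormal`, `ga_fit`, and `impute` are hypothetical.

```python
import numpy as np

def flag_abnormal(values, k=3.0):
    """Sub-process 1 (assumed rule): set values more than k scaled MADs from the median to NaN."""
    v = np.asarray(values, dtype=float).copy()
    med = np.nanmedian(v)
    mad = np.nanmedian(np.abs(v - med)) * 1.4826
    if mad > 0:
        v[np.abs(v - med) > k * mad] = np.nan
    return v

def forward(w, X, n_hidden):
    """One-hidden-layer tanh network evaluated from a flat weight vector w."""
    n_in = X.shape[1]
    i = n_in * n_hidden
    W1 = w[:i].reshape(n_in, n_hidden)
    b1 = w[i:i + n_hidden]
    W2 = w[i + n_hidden:i + 2 * n_hidden]
    b2 = w[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def ga_fit(X, y, n_hidden=4, pop=40, gens=150, seed=0):
    """Search the network weights with a simple genetic algorithm (assumed operators)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + 2 * n_hidden + 1
    P = rng.normal(0.0, 1.0, (pop, dim))

    def mse(w):
        return np.mean((forward(w, X, n_hidden) - y) ** 2)

    for _ in range(gens):
        fitness = np.array([mse(w) for w in P])
        elite = P[np.argsort(fitness)[:pop // 4]]        # keep the best quarter
        n_child = pop - len(elite)
        pa = elite[rng.integers(len(elite), size=n_child)]
        pb = elite[rng.integers(len(elite), size=n_child)]
        mask = rng.random(pa.shape) < 0.5                 # uniform crossover
        children = np.where(mask, pa, pb) + rng.normal(0.0, 0.1, pa.shape)  # mutation
        P = np.vstack([elite, children])
    fitness = np.array([mse(w) for w in P])
    return P[np.argmin(fitness)]

def impute(X, y, n_hidden=4, **ga_kwargs):
    """Sub-process 2: predict NaN entries of y from the rows whose y is valid."""
    y = np.asarray(y, dtype=float).copy()
    known = ~np.isnan(y)
    mu = y[known].mean()
    sd = y[known].std() or 1.0                            # avoid division by zero
    w = ga_fit(X[known], (y[known] - mu) / sd, n_hidden=n_hidden, **ga_kwargs)
    y[~known] = forward(w, X[~known], n_hidden) * sd + mu
    return y
```

A typical call would first pass the raw electric quantity column through `flag_abnormal` and then hand the result, together with whatever explanatory features are available, to `impute`. Searching the weights by genetic algorithm alone keeps the sketch short; a hybrid that refines the best individual with gradient descent is a common variant of the same idea.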