A noise filtering method using neural networks

During the data collection and labeling process, noise can be introduced into a data set. As a result, the quality of the data set degrades, and experiments and inferences derived from it become less reliable. In this paper we present an algorithm, called ANR (automatic noise reduction), that serves as a filtering mechanism to identify and remove noisy data items whose classes have been mislabeled. ANR is built on a framework of multi-layer artificial neural networks. It assigns each data item a soft class label in the form of a class probability vector, which is initialized to the original class label and can be modified during training. When the noise level is reasonably small (< 30%), the non-noisy data dominates in shaping the network and its output, so mislabeled data can be corrected by aligning each item's class probability vector with the network output. With a learning procedure that updates each class probability vector based on its difference from the network output, the probability of a mislabeled class gradually decreases while that of the correct class increases, eventually correcting the mislabeled data after sufficient training. After training, data items whose classes have been relabeled are treated as noisy and removed from the data set. We evaluate the performance of ANR on 12 data sets drawn from the UCI repository. The results show that ANR identifies a significant portion of the noisy data. Using ANR as a training-data filter for a nearest neighbor classifier yields an average accuracy increase of 24.5% at a noise level of 25%, compared with the same classifier without ANR.
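The label-correction loop described in the abstract can be sketched compactly. The Python sketch below is a paraphrase of that description, not the paper's implementation: the network interface (`train_step`, `predict_proba`), the update rate `alpha`, and the epoch count are all assumed placeholders, and the paper's exact update rule and architecture may differ.

```python
import numpy as np

def anr_filter(X, y, n_classes, train_step, predict_proba,
               alpha=0.1, epochs=100):
    """Sketch of ANR-style noise filtering (details assumed, see lead-in).

    X            : (n, d) feature matrix
    y            : (n,) original (possibly noisy) integer class labels
    train_step   : callable(X, P) that fits/updates the network on soft labels P
    predict_proba: callable(X) -> (n, n_classes) network output probabilities

    Returns indices of items whose soft labels flipped class; ANR treats
    these as mislabeled data to be removed from the training set.
    """
    # Soft labels: one class probability vector per item, initialized
    # from the given (hard) class labels.
    P = np.eye(n_classes)[y].astype(float)

    for _ in range(epochs):
        train_step(X, P)           # fit the network to the current soft labels
        Q = predict_proba(X)       # network's class probability estimates
        # Move each probability vector toward the network output. With noise
        # below roughly 30%, the clean majority dominates the network output,
        # so mislabeled items gradually drift toward their correct class.
        P += alpha * (Q - P)
        P /= P.sum(axis=1, keepdims=True)   # keep each row a distribution

    # Items whose dominant class no longer matches the original label
    # are flagged as suspected noise.
    return np.flatnonzero(P.argmax(axis=1) != y)
```

Under this reading, removing the returned indices before training a nearest neighbor classifier is what produces the reported accuracy gains; the hyperparameters here are illustrative only.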
