Improving Label Noise Filtering by Exploiting Unlabeled Data

As data volumes grow, more and more training data become available for machine learning tasks. With large training sets, however, perfect labeling is hard to guarantee. Some labels will be incorrect, and this label noise can degrade learning performance. A common remedy is noise filtering: identifying and removing mislabeled samples before learning. Many noise filtering approaches have been proposed, but almost all of them consider only the mislabeled training data and ignore unlabeled data. Unlabeled data are common in many applications, and their value has been extensively studied and recognized. In this paper, we therefore explore how unlabeled data can be used effectively to improve noise filtering performance. To this end, we propose a novel noise filtering algorithm called enhanced soft majority voting by exploiting unlabeled data (ESMVU), an ensemble-learning-based filter that adopts a soft majority voting strategy. ESMVU provides a systematic way to measure the value of unlabeled data by considering several aspects, such as label confidence and the sample distribution. Experiments and comparisons with other methods confirm the effectiveness of the proposed method.
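To make the idea of a soft-majority-voting filter concrete, the following is a minimal sketch and not the ESMVU algorithm itself: the abstract does not specify how unlabeled data enter the vote, so that part is omitted. The function name `soft_vote_noise_flags` and the input layout are assumptions made for illustration. The sketch averages the class probabilities produced by several classifiers and flags a sample as potentially mislabeled when the averaged prediction disagrees with its given label.

```python
from typing import List, Sequence


def soft_vote_noise_flags(
    probas: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[int],
) -> List[bool]:
    """Flag samples whose given label disagrees with the soft vote.

    probas[m][i][c] is the probability that classifier m assigns
    class c to sample i (hypothetical layout, assumed here).
    """
    n_models = len(probas)
    flags = []
    for i, given_label in enumerate(labels):
        n_classes = len(probas[0][i])
        # Soft voting: average each class probability over all classifiers.
        avg = [
            sum(probas[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # The ensemble prediction is the class with the highest average.
        predicted = max(range(n_classes), key=avg.__getitem__)
        flags.append(predicted != given_label)
    return flags


# Three classifiers, two samples, two classes; both samples are labeled 0.
example_probas = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.6, 0.4], [0.55, 0.45]],
]
example_flags = soft_vote_noise_flags(example_probas, [0, 0])
print(example_flags)  # [False, True]
```

In a real filter the probabilities would normally come from classifiers that did not see the sample during training (for example, out-of-fold predictions), so that a mislabeled sample cannot simply vote for its own label.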
