A Comparison of Techniques for Handling Incomplete Input Data with a Focus on Attribute Relevance Influence

This work presents a new approach, based on support vector regression, for handling incomplete input (unseen) data and compares it with existing techniques. The empirical analysis covers 18 real data sets and five different classifiers, with the aim of determining which technique is most suitable for each classifier. The study also examines how the relevance of the missing attribute affects the performance of each (handling algorithm, classifier) pair. Experimental results show that no technique is best for all classifiers. However, combining the proposed strategy with the nearest neighbor classifier appears to be the best choice for dealing with missing attribute values in the input data.
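
As an illustration only, the following Python sketch shows the general idea of estimating a missing attribute of an unseen instance by regression on its observed attributes and then classifying the completed instance with a nearest neighbor rule. It is not the implementation evaluated in this work. The toy data, the one-dimensional linear support vector regressor trained by subgradient descent on the epsilon-insensitive loss, the hyperparameter values, and the names fit_linear_svr and nearest_neighbor are all assumptions made for the example.

```python
import math

def fit_linear_svr(xs, ys, eps=0.1, lam=1e-3, lr=0.01, epochs=3000):
    """Fit y ~ w*x + b by subgradient descent on the
    epsilon-insensitive loss with a small L2 penalty on w.
    (Hypothetical helper; a simplified stand-in for a full SVR solver.)"""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        gw, gb = lam * w, 0.0
        for x, y in zip(xs, ys):
            r = y - (w * x + b)
            if r > eps:        # prediction too low by more than eps
                gw -= x / n
                gb -= 1.0 / n
            elif r < -eps:     # prediction too high by more than eps
                gw += x / n
                gb += 1.0 / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def nearest_neighbor(train_X, train_y, query):
    """Return the label of the training instance closest to query (1-NN)."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return train_y[best]

# Toy training set (assumed): two attributes and a binary class label.
train_X = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0),
           (6.0, 13.0), (7.0, 15.0), (8.0, 17.0), (9.0, 19.0)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

# Learn to predict the second attribute from the first, on training data only.
w, b = fit_linear_svr([x[0] for x in train_X], [x[1] for x in train_X])

# Unseen instance whose second attribute is missing.
observed_first = 2.0
imputed = w * observed_first + b

# Classify the completed instance.
predicted = nearest_neighbor(train_X, train_y, (observed_first, imputed))
print(round(imputed, 2), predicted)
```

In this sketch the regressor is fitted once on the training set and reused for every incomplete test instance. With more than two attributes, one regressor per attribute that may be missing would be needed, each taking the remaining attributes as inputs.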