Applying Mutual Information for Prototype or Instance Selection in Regression Problems

1 - University of Jaén, Department of Informatics, Spain
2 - University of Granada, Department of Computer Technology and Architecture, Spain
3 - Helsinki University of Technology, Information and Computer Science Department, Finland

Abstract. The problem of selecting the patterns to be learned by a model is usually not addressed at the time the model is designed, but as a preprocessing step. Information theory provides a robust theoretical framework for input variable selection through the concept of mutual information. Since the computation of mutual information for regression tasks has recently been proposed, this paper presents a new application of the concept: using mutual information not to select variables but to decide which prototypes should belong to the training data set in regression problems. The proposed methodology decides whether a prototype should belong to the training set using the estimated mutual information between the variables as the criterion. The novelty of the approach is its focus on prototype selection for regression problems instead of classification, since the majority of the literature deals only with the latter. Another element that distinguishes this work from others is that it is proposed not as an outlier identifier but as an algorithm that determines the best subset of input vectors at the time of building a model to approximate the data. As the experimental section shows, the new method is able to identify a high percentage of the real data set when applied to highly distorted data sets.
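The abstract gives only a high-level description of the criterion. The Python sketch below illustrates one plausible reading of it, assuming a leave-one-out scheme in which a prototype is discarded when its removal increases the estimated mutual information between inputs and output (i.e., the prototype behaves like noise with respect to the input-output relation). The helper names (`mi_score`, `select_prototypes`) and the use of scikit-learn's k-NN-based `mutual_info_regression` as the estimator are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression


def mi_score(X, y, n_neighbors=3, random_state=0):
    """Aggregate MI estimate between inputs and output.

    Sums the per-feature k-NN mutual information estimates; using the
    sum as a single scalar criterion is a simplifying assumption.
    """
    return mutual_info_regression(
        X, y, n_neighbors=n_neighbors, random_state=random_state
    ).sum()


def select_prototypes(X, y, n_neighbors=3):
    """Leave-one-out MI criterion (assumed interpretation).

    A prototype is kept only if removing it does not increase the MI
    estimate; if MI rises once the prototype is left out, the point is
    treated as noisy and excluded from the training set.
    """
    base = mi_score(X, y, n_neighbors)
    keep = np.ones(len(y), dtype=bool)
    for i in range(len(y)):
        mask = np.ones(len(y), dtype=bool)
        mask[i] = False
        # Higher MI without prototype i => i likely distorts the relation.
        if mi_score(X[mask], y[mask], n_neighbors) > base:
            keep[i] = False
    return keep


# Usage on a toy regression task with a few deliberately corrupted points,
# mirroring the "highly distorted data set" setting the abstract mentions.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.normal(size=200)
y[::20] += 2.0  # inject gross noise into every 20th prototype
keep = select_prototypes(X, y)
X_clean, y_clean = X[keep], y[keep]
```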
