Local feature selection for heterogeneous problems

Modern electronic data repositories contain enormous amounts of data that include unknown and potentially interesting patterns and relations. One commonly used approach is supervised machine learning, in which a set of training instances is used to train one or more classifiers that map the space formed by the instances' features into the set of class values. The classifiers are later used to classify new instances with unknown class values. Multidimensional data is sometimes feature-space heterogeneous: different features have different importance in different subareas of the space. In this paper we describe a technique that searches for a division of the feature space, identifying the best subset of features for each instance. Our technique is based on the wrapper approach, in which a classification algorithm serves as the evaluation function to differentiate between feature subsets. To make the feature selection dynamic, or local, we apply a recently developed technique for dynamic integration of classifiers to determine which classifier, with which feature subset, is applied to each new instance. The technique can be used in cases of implicit heterogeneity, when the regions of heterogeneity cannot be defined by a simple dependency. We conduct experiments on well-known datasets from the UCI machine learning repository using ensembles of simple base classifiers. The base classifiers in the experiments are built with C4.5 on a single feature each, while the combining method itself may take any number of the available features into account. The results are promising and show that local data mining can be advantageous compared with mining the whole space: in many cases the accuracy is higher and the processing time is shorter, because only a small set of features is used to classify each new instance.
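The selection mechanism described above can be illustrated with a minimal sketch. It assumes single-feature threshold classifiers as stand-ins for the one-feature C4.5 trees, and nearest-neighbour error estimation as the locality mechanism for dynamic selection; the function names, the value of k, and the toy data are illustrative assumptions, not the paper's actual algorithm or datasets.

```python
import math

def train_stump(xs, ys):
    """Train a one-feature threshold classifier (a stand-in for a
    single-feature C4.5 tree): pick the split maximizing training accuracy."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            acc = sum((left if x <= t else right) == y for x, y in zip(xs, ys))
            if best is None or acc > best[0]:
                best = (acc, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def dynamic_select(instance, train_X, train_y, stumps, k=3):
    """Dynamic integration sketch: estimate each base classifier's error on
    the k nearest training instances and apply the locally best one."""
    neighbours = sorted(range(len(train_X)),
                        key=lambda i: math.dist(instance, train_X[i]))[:k]
    def local_err(j):
        # Each stump sees only its own feature of each neighbour.
        return sum(stumps[j](train_X[i][j]) != train_y[i] for i in neighbours)
    best_j = min(range(len(stumps)), key=local_err)
    return best_j, stumps[best_j](instance[best_j])

# Toy heterogeneous data: class is 0 when x0 < 0.4, otherwise it follows x1.
train_X = [(0.1, 0.2), (0.2, 0.8), (0.8, 0.2), (0.9, 0.9), (0.7, 0.8)]
train_y = [0, 0, 0, 1, 1]
stumps = [train_stump([x[j] for x in train_X], train_y) for j in range(2)]

j1, pred1 = dynamic_select((0.75, 0.10), train_X, train_y, stumps)
j2, pred2 = dynamic_select((0.15, 0.85), train_X, train_y, stumps)
```

In this sketch neither single-feature stump is perfect globally, but different subareas of the feature space favour different stumps, so the local selection step picks a different feature subset for each query instance.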
