Diverse Selection of Feature Subsets for Ensemble Regression

Regression tasks such as forecasting sensor values play a principal role in industrial applications. For instance, modern automobiles have hundreds of process variables that are used to predict target sensor values. Owing to the complexity of these systems, each subset of features often exhibits a different type of correlation with the target. Capturing such local interactions improves the regression models. Nevertheless, most existing feature selection algorithms focus on obtaining a single projection of the features and cannot exploit the multiple local interactions arising from different subsets of variables. It remains an open challenge to efficiently select multiple subsets that not only contribute to the prediction quality but are also diverse, i.e., subsets with complementary information. Such diverse subsets enrich the regression model with novel and essential knowledge by capturing the local interactions through multiple views of a high-dimensional feature space. In this work, we propose a framework for selecting multiple diverse feature subsets. First, our approach prunes the feature space using the properties of multiple correlation measures. The pruned feature space is then used to systematically generate new, diverse combinations of feature subsets without a decrease in prediction quality. We show that our approach outperforms prevailing approaches on synthetic data and several real-world datasets from different application domains.
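The pipeline the abstract outlines — prune the feature space with correlation measures, generate diverse subsets, and ensemble one regressor per subset — can be illustrated with a minimal sketch. This is not the authors' algorithm: the single Pearson-based pruning step, the Jaccard-overlap diversity criterion, the subset size of 3, and the per-subset least-squares models are all simplifying assumptions chosen for brevity.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Synthetic high-dimensional data: 200 samples, 12 features,
# with a local interaction between features 3 and 4.
X = rng.normal(size=(200, 12))
y = X[:, 0] + 0.5 * X[:, 3] * X[:, 4] + 0.1 * rng.normal(size=200)

# Step 1 (pruning): rank features by absolute Pearson correlation
# with the target and keep the top 8. The paper combines several
# correlation measures; a single measure is used here for brevity.
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
kept = list(np.argsort(corr)[-8:])

# Step 2 (diversity): greedily pick size-3 subsets whose pairwise
# Jaccard overlap with already-chosen subsets stays small, so each
# new subset carries complementary information.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

candidates = list(combinations(kept, 3))
chosen = [candidates[0]]
for c in candidates[1:]:
    if all(jaccard(c, s) < 0.34 for s in chosen):  # share at most 1 feature
        chosen.append(c)
    if len(chosen) == 5:
        break

# Step 3 (ensemble): fit one least-squares model per subset and
# average the predictions.
preds = []
for s in chosen:
    Xs = np.column_stack([X[:, list(s)], np.ones(len(y))])
    w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    preds.append(Xs @ w)
ensemble = np.mean(preds, axis=0)
```

With subsets of size 3, a Jaccard threshold below 1/3 forces any two chosen subsets to share at most one feature, which is one simple way to operationalize "complementary information".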
