SPMoE: a novel subspace-projected mixture of experts model for multi-target regression problems

In this paper, we focus on modeling multi-target regression problems with high-dimensional feature spaces and a small number of instances, a setting common in many real-life predictive modeling problems. With the aim of designing an accurate prediction tool, we present a novel mixture of experts (MoE) model called the subspace-projected MoE (SPMoE). The experts of the SPMoE are trained in a boosting-like manner by combining ideas from subspace projection methods and the negative correlation learning (NCL) algorithm. Instead of using the whole original input space to train the experts, we develop a new cluster-based subspace projection method that, at each step of the boosting procedure, produces projected subspaces focused on the difficult instances, yielding diverse experts. The experts are then trained on the obtained subspaces using a new NCL algorithm called sequential NCL. The SPMoE is compared with other ensemble models on three real high-dimensional multi-target regression problems: electrical discharge machining, energy efficiency, and an important problem in the field of operations strategy called the practice–performance problem. The experimental results show that the prediction accuracy of the SPMoE is significantly better than that of the other ensemble and single models, and that the SPMoE can be considered a promising alternative for modeling high-dimensional multi-target regression problems.
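
The abstract above outlines the training procedure only at a high level. As a rough illustration, the sketch below shows one way the described loop could look, assuming linear experts, agglomerative clustering of features with one principal component per cluster as the subspace projection, a fixed-ensemble-mean reformulation of the NCL penalty for the sequential step, and a uniform gate; every helper name and hyper-parameter (cluster_subspace, n_experts, lam) is hypothetical and not taken from the paper.

```python
# Hypothetical sketch of an SPMoE-style training loop; the details below are
# assumptions for illustration, not the authors' implementation.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

def cluster_subspace(X, weights, n_clusters=3):
    """Cluster features and project each cluster onto its first principal
    component; rows are weighted so difficult instances drive the clustering."""
    Xw = X * weights[:, None]  # emphasise currently difficult instances
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(Xw.T)
    pcas = [PCA(n_components=1).fit(X[:, labels == c]) for c in range(n_clusters)]

    def project(Z):
        return np.hstack([p.transform(Z[:, labels == c])
                          for c, p in enumerate(pcas)])
    return project

def train_spmoe(X, Y, n_experts=5, lam=0.5):
    """Boosting-like sequential training: each new expert sees a subspace
    focused on hard instances and an NCL-adjusted target (requires lam < 1)."""
    experts, pred_sum = [], np.zeros(Y.shape)
    weights = np.full(len(X), 1.0 / len(X))  # boosting-style instance weights
    for m in range(n_experts):
        project = cluster_subspace(X, weights)
        Z = project(X)
        # Sequential NCL with the current ensemble mean held fixed: minimising
        # (f - y)^2 - lam*(f - mean)^2 over f gives f = (y - lam*mean)/(1 - lam).
        target = Y if m == 0 else (Y - lam * pred_sum / m) / (1.0 - lam)
        expert = Ridge(alpha=1.0).fit(Z, target, sample_weight=weights)
        experts.append((project, expert))
        pred_sum += expert.predict(Z)
        # Re-weight: instances the current ensemble predicts poorly gain mass.
        err = np.linalg.norm(Y - pred_sum / (m + 1), axis=1) + 1e-12
        weights = err / err.sum()
    return experts

def predict_spmoe(experts, X):
    # Uniform gate for simplicity; a learned MoE gate would replace this.
    return np.mean([e.predict(p(X)) for p, e in experts], axis=0)
```

The NCL-adjusted target follows from setting the derivative of the penalised squared error to zero with the ensemble mean frozen, which is what makes the procedure sequential rather than jointly trained; a gating network, as in a standard MoE, would replace the uniform average in predict_spmoe.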
