Eliminating Multicollinearity Issues in Neural Network Ensembles: Incremental, Negatively Correlated, Optimal Convex Blending

Given a {features, target} dataset, we introduce an incremental algorithm that constructs an aggregate regressor from an ensemble of neural networks. It is well known that ensemble methods suffer from multicollinearity: a manifestation of redundancy that arises mainly from the shared training dataset. In the present incremental approach, at each stage we optimally blend the current aggregate regressor with a newly trained neural network under a convexity constraint which, when necessary, induces negative correlations. Under this framework, collinearity issues do not arise at all, rendering the method both accurate and robust.
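The blending step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the stage-wise problem is a one-dimensional least-squares fit of the convex weight `alpha` on held-out residuals, solved in closed form and clipped to [0, 1] to enforce convexity. The function names (`optimal_convex_alpha`, `incremental_blend`) are hypothetical, and fixed prediction vectors stand in for trained networks.

```python
import numpy as np

def optimal_convex_alpha(y, f_agg, f_new):
    """Weight alpha in [0, 1] minimizing the MSE of
    alpha * f_agg + (1 - alpha) * f_new (convexity via clipping)."""
    e1 = y - f_agg                # residual of the current aggregate
    e2 = y - f_new                # residual of the new network
    d = e1 - e2
    denom = float(np.dot(d, d))
    if denom == 0.0:              # identical predictors: any alpha works
        return 1.0
    # Unconstrained minimizer of ||alpha * e1 + (1 - alpha) * e2||^2
    alpha = float(np.dot(e2 - e1, e2)) / denom
    return float(np.clip(alpha, 0.0, 1.0))

def incremental_blend(y, predictions):
    """Fold a sequence of per-model prediction vectors into one
    aggregate, one optimal convex blend at a time."""
    f_agg = predictions[0]
    for f_new in predictions[1:]:
        a = optimal_convex_alpha(y, f_agg, f_new)
        f_agg = a * f_agg + (1.0 - a) * f_new
    return f_agg

# Synthetic stand-ins for three trained networks' predictions.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
preds = [y + rng.normal(scale=s, size=200) for s in (0.5, 0.7, 0.9)]
blend = incremental_blend(y, preds)
```

Because `alpha = 1` recovers the current aggregate and `alpha = 0` the new network, the stage-wise optimum over [0, 1] can never exceed either component's MSE, so the aggregate's error is non-increasing as networks are added.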
