The MBPEP: a deep ensemble pruning algorithm providing high quality uncertainty prediction

Machine learning algorithms have been applied effectively to various real-world tasks. However, it is difficult to provide high-quality machine learning solutions when the distribution of the input data is unknown; this difficulty is called the uncertainty prediction problem. In this paper, a margin-based Pareto deep ensemble pruning (MBPEP) model is proposed. Using deep ensemble networks, it achieves high-quality uncertainty estimation with a small mean prediction interval width (MPIW) and a high prediction interval coverage probability (PICP). In addition to these networks, unique loss functions are proposed that make the sub-learners trainable with standard gradient descent. Furthermore, a Pareto pruning method based on margin-criterion fine-tuning is introduced to optimize the ensembles. Several experiments, including uncertainty prediction for classification and regression, are conducted to analyze the performance of MBPEP. The experimental results show that MBPEP achieves a small interval width and a low learning error with an optimal number of ensembles. On real-world problems, MBPEP performs well on incoming datasets with unknown distributions and improves learning performance on a multi-task problem compared with each single model.
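
The two quality measures named above can be illustrated with a minimal sketch. It assumes the definitions commonly used in the prediction-interval literature: PICP as the fraction of targets that fall inside their predicted interval, and MPIW as the average interval width. The function name `picp_mpiw` and the sample values are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def picp_mpiw(
    y_true: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> Tuple[float, float]:
    """Return (PICP, MPIW) for a set of prediction intervals.

    Assumed definitions (standard in the literature, not quoted from the paper):
      PICP = share of targets y_i with lower_i <= y_i <= upper_i
      MPIW = mean of (upper_i - lower_i)
    """
    n = len(y_true)
    if n == 0 or len(lower) != n or len(upper) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    # Count the targets captured by their interval.
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    # Sum the interval widths over all samples.
    total_width = sum(hi - lo for lo, hi in zip(lower, upper))
    return covered / n, total_width / n


# Hypothetical data: three of the four targets lie inside their interval.
y = [1.0, 2.0, 3.0, 4.0]
lo = [0.5, 2.5, 2.0, 3.0]
hi = [1.5, 3.5, 4.0, 5.0]
picp, mpiw = picp_mpiw(y, lo, hi)
print(picp, mpiw)  # 0.75 1.5
```

A high-quality interval estimator in this sense keeps PICP at or above the desired confidence level while making MPIW as small as possible; the two goals pull against each other, which is why they are reported together.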
