Uncertainty Quantification in Extreme Learning Machine: Analytical Developments, Variance Estimates and Confidence Intervals

Uncertainty quantification is crucial for assessing the prediction quality of a machine learning model. In the case of Extreme Learning Machines (ELM), most methods proposed in the literature make strong assumptions about the data, ignore the randomness of the input weights, or neglect the bias contribution in confidence interval estimation. This paper presents novel estimates that overcome these constraints and improve the understanding of ELM variability. Analytical derivations are provided under general assumptions, supporting the identification and interpretation of the contributions of the different variability sources. Under both homoskedasticity and heteroskedasticity, several variance estimates are proposed, investigated, and numerically tested, and shown to reproduce the expected variance behaviours. Finally, the feasibility of confidence interval estimation is discussed from a critical standpoint, raising ELM users' awareness of some of its pitfalls. The paper is accompanied by a scikit-learn-compatible Python library enabling efficient computation of all the estimates discussed herein.
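To make the quantities discussed above concrete, the sketch below fits a ridge-regularized ELM and computes two prediction-variance estimates conditional on the random hidden weights: a homoskedastic one and a White-style heteroskedasticity-consistent (HC0) one. This is a minimal NumPy illustration under stated assumptions; the function names, the tanh activation, and the defaults are invented here for illustration and are not the API of the paper's accompanying library.

```python
# Minimal sketch: ridge-regularized ELM with two prediction-variance
# estimates, conditional on the random hidden weights. Illustrative only;
# not the interface of the paper's library.
import numpy as np

def fit_elm(X, y, n_hidden=100, alpha=1e-2, seed=0):
    """Fit a ridge-regularized ELM and cache what the variance formulas need."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random input weights
    b = rng.standard_normal(n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                            # hidden-layer features
    A_inv = np.linalg.inv(H.T @ H + alpha * np.eye(n_hidden))
    beta = A_inv @ (H.T @ y)                          # ridge output weights
    return {"W": W, "b": b, "beta": beta, "A_inv": A_inv, "H": H,
            "resid": y - H @ beta}

def predict_with_variance(model, X_new, heteroskedastic=False):
    """Point predictions plus a variance estimate, conditional on W and b."""
    H, A_inv, resid = model["H"], model["A_inv"], model["resid"]
    H0 = np.tanh(X_new @ model["W"] + model["b"])
    if heteroskedastic:
        # White-style HC0 "meat": H' diag(resid^2) H.
        meat = H.T @ (resid[:, None] ** 2 * H)
    else:
        # Homoskedastic noise with a naive n - p degrees-of-freedom correction;
        # the paper's refined corrections are omitted in this sketch.
        n, p = H.shape
        sigma2 = resid @ resid / max(n - p, 1)
        meat = sigma2 * (H.T @ H)
    S = A_inv @ meat @ A_inv                          # ridge sandwich covariance
    var = np.einsum("ij,jk,ik->i", H0, S, H0)         # per-sample h0' S h0
    return H0 @ model["beta"], var
```

From these quantities a naive Gaussian interval would be ŷ ± z · √var; as the abstract cautions, such an interval ignores both the bias contribution and the randomness of the input weights, which is precisely the gap the paper addresses.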
