An overview of advances in reliability estimation of individual predictions in machine learning

In machine learning, the predictive accuracy of a model is most commonly assessed by analyzing its average accuracy over a data set. In general, predictive models do not provide accuracy estimates for their individual predictions; such reliability estimates require the analysis of various model and instance properties. In this paper we give an overview of approaches to estimating the reliability of individual predictions. We begin by summarizing three research fields that have provided ideas and motivation for our work: (a) approaches to perturbing learning data, (b) the use of unlabeled data in supervised learning, and (c) sensitivity analysis. The main part of the paper presents two classes of reliability estimation approaches and summarizes the relevant terminology, which is often used in this and related research fields.
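To make the data-perturbation idea concrete, the following is a minimal sketch, not the paper's exact method: it scores the reliability of a single regression prediction by the variance of predictions from models trained on bootstrap resamples of the learning data (higher variance suggesting a less reliable prediction). The synthetic dataset, the choice of DecisionTreeRegressor, and the number of resamples are illustrative assumptions.

```python
# Sketch: perturbation-based reliability estimate for one regression prediction.
# Train many models on bootstrap resamples and use the variance of their
# predictions for the instance as a (lower-is-better) reliability score.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
x_new = X[0:1]  # the individual instance whose prediction we assess

preds = []
for _ in range(50):  # 50 resamples is an arbitrary illustrative choice
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample of the data
    model = DecisionTreeRegressor().fit(X[idx], y[idx])
    preds.append(model.predict(x_new)[0])

point_prediction = float(np.mean(preds))
reliability = float(np.var(preds))  # high variance -> less reliable prediction
print(f"prediction={point_prediction:.2f}, variance estimate={reliability:.2f}")
```

The same skeleton extends to the other perturbation-based estimates surveyed in the paper: instead of resampling the data, one can perturb instance attributes or labels, retrain, and measure how strongly the individual prediction reacts.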
