Efficient conformal regressors using bagged neural nets

Conformal predictors use machine learning models to output prediction sets; for regression, a prediction set is simply a prediction interval. All conformal predictors are valid, meaning that the error rate on novel data is bounded by a preset significance level. The key performance metric for conformal predictors is therefore their efficiency, i.e., the size of the prediction sets. Inductive conformal predictors obtain the prediction regions using real-valued functions, called nonconformity functions, and a calibration set, i.e., a set of labeled instances not used for model training. In state-of-the-art conformal regressors, the nonconformity functions are normalized, i.e., they include a component estimating the difficulty of each instance. In this study, conformal regressors are built on top of ensembles of bagged neural networks, and several nonconformity functions are evaluated. In addition, the option of calibrating on out-of-bag instances instead of setting aside a separate calibration set is investigated. The experiments, using 33 publicly available data sets, show that normalized nonconformity functions can produce smaller prediction sets, but that the efficiency is highly dependent on the quality of the difficulty estimation. Specifically, in this study, the most efficient normalized nonconformity function estimated the difficulty of an instance as the average error of its neighboring instances. These results are consistent with previous studies using random forests as underlying models. Calibrating on out-of-bag instances, however, only led to more efficient conformal predictors on smaller data sets, in sharp contrast to the random forest study, where out-of-bag calibration was significantly better overall.
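The sketch below illustrates the inductive conformal regression setup described in the abstract. It is not the authors' implementation: as assumptions, it uses scikit-learn's BaggingRegressor over MLPRegressor base learners to stand in for the bagged neural-network ensemble, calibrates on a held-out calibration set (rather than out-of-bag instances), and estimates the difficulty of an instance as the average absolute error of its k nearest calibration instances, in the spirit of the best-performing normalized nonconformity function reported above.

```python
# Minimal sketch of an inductive conformal regressor with a normalized
# nonconformity function. Assumptions (not from the paper): X and y are
# NumPy arrays, and the stand-ins below replace the paper's exact setup.
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import NearestNeighbors
from sklearn.model_selection import train_test_split


def conformal_intervals(X, y, X_test, significance=0.1, k=10, beta=0.01):
    # Split off a calibration set that is not used for model training.
    X_train, X_cal, y_train, y_cal = train_test_split(
        X, y, test_size=0.3, random_state=0
    )

    # Underlying model: a bagged ensemble of small neural networks.
    model = BaggingRegressor(
        estimator=MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000),
        n_estimators=10,
        random_state=0,
    ).fit(X_train, y_train)

    # Absolute calibration errors and a k-NN index over the calibration set.
    cal_err = np.abs(y_cal - model.predict(X_cal))
    nn = NearestNeighbors(n_neighbors=k).fit(X_cal)

    def difficulty(X_q):
        # sigma(x): average absolute error of the k nearest calibration
        # instances (a calibration instance may count itself; kept simple).
        _, idx = nn.kneighbors(X_q)
        return cal_err[idx].mean(axis=1)

    # Normalized nonconformity scores: |y - y_hat| / (sigma(x) + beta).
    alphas = cal_err / (difficulty(X_cal) + beta)

    # The relevant quantile of the calibration scores at the chosen
    # significance level (clamped to the largest score in this sketch).
    n = len(alphas)
    s = int(np.ceil((n + 1) * (1 - significance))) - 1
    alpha_s = np.sort(alphas)[min(s, n - 1)]

    # Prediction intervals: y_hat +/- alpha_s * (sigma(x) + beta).
    y_hat = model.predict(X_test)
    half_width = alpha_s * (difficulty(X_test) + beta)
    return np.column_stack([y_hat - half_width, y_hat + half_width])
```

The interval width thus varies per instance: easy instances (small sigma) get tight intervals, hard ones get wide intervals, while the overall error rate stays below the chosen significance level.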
