Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications

The conformal prediction framework is a recent development in machine learning that can associate a reliable measure of confidence with a prediction in any real-world pattern recognition application, including risk-sensitive applications such as medical diagnosis, face recognition, and financial risk prediction. Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications captures the basic theory of the framework, demonstrates how to apply it to real-world problems, and presents several adaptations, including active learning, change detection, and anomaly detection. As practitioners and researchers around the world apply and adapt the framework, this edited volume brings together these bodies of work, providing a springboard for further research as well as a handbook for application to real-world problems.

Understand the theoretical foundations of this framework, which can provide a reliable measure of confidence with predictions in machine learning.
Be able to apply the framework to real-world problems in different machine learning settings, including classification, regression, and clustering.
Learn effective ways of adapting the framework to newer problem settings, such as active learning, model selection, or change detection.

[1]  Shai Ben-David,et al.  Detecting Change in Data Streams , 2004, VLDB.

[2]  Nikolay I. Nikolaev,et al.  Single-Stacking Conformity Approach to Reliable Classification , 2010, AIMSA.

[3]  Andrew W. Moore,et al.  Internet traffic classification using bayesian analysis techniques , 2005, SIGMETRICS '05.

[4]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[5]  Burr Settles,et al.  Active Learning Literature Survey , 2009 .

[6]  Nikolaos Papanikolopoulos,et al.  Breaking the interactive bottleneck in multi-class classification with active selection and binary feedback , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7]  Olivier Chapelle,et al.  Model Selection for Support Vector Machines , 1999, NIPS.

[8]  S. Wold,et al.  The Collinearity Problem in Linear Regression. The Partial Least Squares (PLS) Approach to Generalized Inverses , 1984 .

[9]  J. Friedman Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .

[10]  Kristin P. Bennett,et al.  A Pattern Search Method for Model Selection of Support Vector Regression , 2002, SDM.

[11]  Alexander Gammerman,et al.  Transductive Confidence Machines for Pattern Recognition , 2002, ECML.

[12]  Carla E. Brodley,et al.  Active Class Selection , 2007, ECML.

[13]  Michèle Sebag,et al.  C4.5 competence map: a phase transition-inspired approach , 2004, ICML '04.

[14]  Fabio Roli,et al.  Dynamic classifier selection based on multiple classifier behaviour , 2001, Pattern Recognit..

[15]  J. Kemperman Generalized Tolerance Limits , 1956 .

[16]  Balachander Krishnamurthy,et al.  Sketch-based change detection: methods, evaluation, and applications , 2003, IMC '03.

[17]  Alison L Gibbs,et al.  On Choosing and Bounding Probability Metrics , 2002, math/0209021.

[18]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[19]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[20]  Alexander Gammerman,et al.  Plug-in martingales for testing exchangeability on-line , 2012, ICML.

[21]  David J. C. MacKay,et al.  Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.

[22]  Andrew W. Moore,et al.  Discriminators for use in flow-based classification , 2013 .

[23]  Chris H. Q. Ding,et al.  Minimum redundancy feature selection from microarray gene expression data , 2003, Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003.

[24]  Charu C. Aggarwal,et al.  A framework for diagnosing changes in evolving data streams , 2003, SIGMOD '03.

[25]  H. Jaap van den Herik,et al.  The ROC isometrics approach to construct reliable classifiers , 2009, Intell. Data Anal..

[26]  J. Downing,et al.  Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. , 2003, Blood.

[27]  Ishwar K. Sethi,et al.  Confidence-based active learning , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  C. A. Murthy,et al.  A probabilistic active support vector learning algorithm , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2002, J. Mach. Learn. Res..

[30]  André Elisseeff,et al.  Stability and Generalization , 2002, J. Mach. Learn. Res..

[31]  Harris Papadopoulos,et al.  Reliable Confidence Intervals for Software Effort Estimation , 2009, AIAI Workshops.

[32]  Marko Robnik-Sikonja,et al.  Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF , 2004, Applied Intelligence.

[33]  Nikolaos Papanikolopoulos,et al.  Multi-class batch-mode active learning for image classification , 2010, 2010 IEEE International Conference on Robotics and Automation.

[34]  Alexander Gammerman,et al.  Learning by Transduction , 1998, UAI.

[35]  David J Hand,et al.  Breast Cancer Diagnosis from Proteomic Mass Spectrometry Data: A Comparative Evaluation , 2008, Statistical applications in genetics and molecular biology.

[36]  Gian Luca Foresti,et al.  Trajectory-Based Anomalous Event Detection , 2008, IEEE Transactions on Circuits and Systems for Video Technology.

[37]  J. Steele Stochastic Calculus and Financial Applications , 2000 .

[38]  Jason Weston,et al.  Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.

[39]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[40]  Philip S. Yu,et al.  Mining concept-drifting data streams using ensemble classifiers , 2003, KDD '03.

[41]  Igor Kononenko,et al.  Semi-Naive Bayesian Classifier , 1991, EWSL.

[42]  Alexander Gammerman,et al.  Computationally Efficient Transductive Machines , 2000, ALT.

[43]  J. Friedman Stochastic gradient boosting , 2002 .

[44]  Elsa M. Jordaan,et al.  Confidence of SVM Predictions using a Strangeness Measure , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.

[45]  Xue Zhang,et al.  Batch Mode Active Learning Based Multi-view Text Classification , 2009, 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery.

[46]  Marco Canini,et al.  Efficient application identification and the temporal and spatial stability of classification schema , 2009, Comput. Networks.

[47]  J. Robins,et al.  Distribution-Free Prediction Sets , 2013, Journal of the American Statistical Association.

[48]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[49]  Bernhard Schölkopf,et al.  Learning with kernels , 2001 .

[50]  Thorsten Joachims,et al.  Detecting Concept Drift with Support Vector Machines , 2000, ICML.

[51]  Yoav Benjamini,et al.  Identifying differentially expressed genes using false discovery rate controlling procedures , 2003, Bioinform..

[52]  Fan Yang,et al.  Hedged Predictions for Traditional Chinese Chronic Gastritis Diagnosis with Confidence Machine , 2008, 2008 International Conference on Computer Science and Information Technology.

[53]  Mohan M. Trivedi,et al.  A Survey of Vision-Based Trajectory Learning and Analysis for Surveillance , 2008, IEEE Transactions on Circuits and Systems for Video Technology.

[54]  Erik Strumbelj,et al.  Explanation and reliability of prediction models: the case of breast cancer recurrence , 2010, Knowledge and Information Systems.

[55]  Pat Langley,et al.  Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.

[56]  Geoff Hulten,et al.  Mining time-changing data streams , 2001, KDD '01.

[57]  Sally Floyd,et al.  Wide area traffic: the failure of Poisson modeling , 1995, TNET.

[58]  Johannes Fürnkranz,et al.  An Evaluation of Grading Classifiers , 2001, IDA.

[59]  Igor Kononenko,et al.  Reliable Classifications with Machine Learning , 2002, ECML.

[60]  G. Diamond,et al.  Analysis of probability as an aid in the clinical diagnosis of coronary-artery disease. , 1979, The New England journal of medicine.

[61]  M. Kearns,et al.  Algorithmic stability and sanity-check bounds for leave-one-out cross-validation , 1999 .

[62]  J. Tukey Non-Parametric Estimation II. Statistically Equivalent Blocks and Tolerance Regions--The Continuous Case , 1947 .

[63]  Masashi Sugiyama,et al.  Active Learning with Model Selection in Linear Regression , 2008, SDM.

[64]  F. Fleuret Fast Binary Feature Selection with Conditional Mutual Information , 2004, J. Mach. Learn. Res..

[65]  Nenad Filipovic,et al.  Mining Data From Hemodynamic Simulations for Generating Prediction and Explanation Models , 2010, IEEE Transactions on Information Technology in Biomedicine.

[66]  Yoav Freund,et al.  Experiments with a New Boosting Algorithm , 1996, ICML.

[67]  Vladimir Vovk,et al.  Comparing the Bayes and Typicalness Frameworks , 2001, ECML.

[68]  Kentaro Inui,et al.  Selective Sampling for Example-based Word Sense Disambiguation , 1998, CL.

[69]  Evgueni N. Smirnov,et al.  Meta-conformity approach to reliable classification , 2009, Intell. Data Anal..

[70]  VN Balasubramanian,et al.  Support vector machine based conformal predictors for risk of complications following a coronary Drug Eluting Stent procedure , 2009, 2009 36th Annual Computers in Cardiology Conference (CinC).

[71]  Göran Falkman,et al.  Inductive conformal anomaly detection for sequential detection of anomalous sub-trajectories , 2013, Annals of Mathematics and Artificial Intelligence.

[72]  Vladimir Vovk,et al.  A tutorial on conformal prediction , 2007, J. Mach. Learn. Res..

[73]  Maria Riveiro,et al.  Visual analytics for maritime anomaly detection , 2011 .

[74]  Nello Cristianini,et al.  Query Learning with Large Margin Classifiers , 2000, ICML.

[75]  Vladimir Vovk,et al.  Conditional validity of inductive conformal predictors , 2012, Machine Learning.

[76]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[77]  Zhiyuan Luo,et al.  Time series prediction with performance guarantee , 2011, IET Commun..

[78]  T. Poggio,et al.  General conditions for predictivity in learning theory , 2004, Nature.

[79]  David A. Cohn,et al.  Active Learning with Statistical Models , 1996, NIPS.

[80]  John W. Tukey,et al.  Nonparametric Estimation, III. Statistically Equivalent Blocks and Multivariate Tolerance Regions--The Discontinuous Case , 1948 .

[81]  Ze Dong,et al.  Wind speed conformal prediction in wind farm based on algorithmic randomness theory , 2008, 2008 International Conference on Machine Learning and Cybernetics.

[82]  Anirban Mahanti,et al.  Traffic classification using clustering algorithms , 2006, MineNet '06.

[83]  Harry Wechsler,et al.  Reliable face recognition methods - system design, implementation and evaluation , 2006 .

[84]  Didier Meuwly,et al.  The inference of identity in forensic speaker recognition , 2000, Speech Commun..

[85]  Dale Schuurmans,et al.  Discriminative Batch Mode Active Learning , 2007, NIPS.

[86]  Clive W. J. Granger,et al.  An introduction to long-memory time series models and fractional differencing , 2001 .

[87]  Alexander Gammerman,et al.  Application of Inductive Confidence Machine to ICMLA Competition Data , 2009, 2009 International Conference on Machine Learning and Applications.

[88]  Shunyi Zhang,et al.  Encrypted Internet Traffic Classification Method based on Host Behavior , 2011 .

[89]  Alexander Gammerman,et al.  Multiprobabilistic prediction in early medical diagnoses , 2013, Annals of Mathematics and Artificial Intelligence.

[90]  Vladimir Vovk,et al.  Kernel Ridge Regression , 2013, Empirical Inference.

[91]  M. Olona-Cabases,et al.  The probability of a correct diagnosis , 1994 .

[92]  Andrew W. Moore,et al.  Bayesian Neural Networks for Internet Traffic Classification , 2007, IEEE Transactions on Neural Networks.

[93]  Alexander Gammerman,et al.  Reliable classification of childhood acute leukaemia from gene expression data using confidence machines , 2006, 2006 IEEE International Conference on Granular Computing.

[94]  Joachim M. Buhmann,et al.  Active Data Clustering , 1997, NIPS.

[95]  Jack P. C. Kleijnen Experimental Design for Sensitivity Analysis of Simulation Models , 2001 .

[96]  Shen-Shyang Ho,et al.  A martingale framework for concept change detection in time-varying data streams , 2005, ICML.

[97]  Alexander Gammerman,et al.  Hedging Predictions in Machine Learning: The Second Computer Journal Lecture , 2006, Comput. J..

[98]  Peter A. Flach,et al.  Improving Accuracy and Cost of Two-class and Multi-class Probabilistic Classifiers Using ROC Curves , 2003, ICML.

[99]  Raymond J. Mooney,et al.  Diverse ensembles for active learning , 2004, ICML.

[100]  Matjaz Kukar,et al.  Transductive reliability estimation for medical diagnosis , 2003, Artif. Intell. Medicine.

[101]  Ravi Kothari,et al.  Learning from labeled and unlabeled data using a minimal number of queries , 2003, IEEE Trans. Neural Networks.

[102]  Dale Schuurmans,et al.  Data perturbation for escaping local maxima in learning , 2002, AAAI/IAAI.

[103]  Tom Heskes,et al.  Practical Confidence and Prediction Intervals , 1996, NIPS.

[104]  John Shawe-Taylor,et al.  Pattern analysis for the prediction of fungal pro-peptide cleavage sites , 2009, Discret. Appl. Math..

[105]  Eamonn J. Keogh,et al.  HOT SAX: efficiently finding the most unusual time series subsequence , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[106]  Matjaz Kukar,et al.  Image processing and machine learning for fully automated probabilistic evaluation of medical images , 2011, Comput. Methods Programs Biomed..

[107]  Donald Fraser,et al.  Nonparametric Estimation IV , 1951 .

[108]  Zhiyuan Luo,et al.  Predictions with Confidence in Applications , 2009, MLDM.

[109]  Pawan Sinha,et al.  Face Recognition by Humans: Nineteen Results All Computer Vision Researchers Should Know About , 2006, Proceedings of the IEEE.

[110]  Harris Papadopoulos,et al.  Reliable Confidence Measures for Medical Diagnosis With Evolutionary Algorithms , 2011, IEEE Transactions on Information Technology in Biomedicine.

[111]  Andrew McCallum,et al.  Toward Optimal Active Learning through Sampling Estimation of Error Reduction , 2001, ICML.

[112]  Nicholas D K Petraco,et al.  Addressing the National Academy of Sciences’ Challenge: A Method for Statistical Pattern Comparison of Striated Tool Marks , 2012, Journal of forensic sciences.

[113]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[114]  Vedat Topuz Traffic Demand Prediction Using ANN Simulator , 2007, KES.

[115]  Thomas Villmann,et al.  Cancer informatics by prototype networks in mass spectrometry , 2009, Artif. Intell. Medicine.

[116]  J. Downing,et al.  Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. , 2002, Cancer cell.

[117]  S. S. Wilks Determination of Sample Sizes for Setting Tolerance Limits , 1941 .

[118]  Guido Van Oost,et al.  Identification of Confinement Regimes in Tokamak Plasmas by Conformal Prediction on a Probabilistic Manifold , 2012, AIAI.

[119]  Alexander Gammerman,et al.  Conditional Prediction Intervals for Linear Regression , 2009, 2009 International Conference on Machine Learning and Applications.

[120]  Greg Schohn,et al.  Less is More: Active Learning with Support Vector Machines , 2000, ICML.

[121]  Gang Hua,et al.  Which faces to tag: Adding prior constraints into active learning , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[122]  Richard A. Davis,et al.  Time Series: Theory and Methods , 2013 .

[123]  Patrick Grother,et al.  Face Recognition Vendor Test (FRVT) , 2014 .

[124]  Lázaro Emílio Makili,et al.  Computationally efficient SVM multi-class image recognition with confidence measures , 2011 .

[125]  David A. Freedman,et al.  Rejoinder: On the Consistency of Bayes Estimates , 1986 .

[126]  Larry Wasserman Frasian Inference , 2012 .

[127]  Bernard Zenko,et al.  Is Combining Classifiers with Stacking Better than Selecting the Best One? , 2004, Machine Learning.

[128]  Andrew McCallum,et al.  Employing EM and Pool-Based Active Learning for Text Classification , 1998, ICML.

[129]  Konstantina Papagiannaki,et al.  Toward the Accurate Identification of Network Applications , 2005, PAM.

[130]  Xiaowei Xu,et al.  Representative Sampling for Text Classification Using Support Vector Machines , 2003, ECIR.

[131]  Harry Wechsler,et al.  Learning from data streams using transductive inference and martingale , 2006 .

[132]  Kongqiao Wang,et al.  Active learning for image retrieval with Co-SVM , 2007, Pattern Recognit..

[133]  John Shawe-Taylor,et al.  Classification Accuracy Based on Observed Margin , 1998, Algorithmica.

[134]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[135]  Sebastian Zander,et al.  A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification , 2006, CCRV.

[136]  Guang Li,et al.  Application of Conformal Predictors to Tea Classification Based on Electronic Nose , 2010, AIAI.

[137]  Zhiyuan Luo,et al.  Reliable Probabilistic Classification and Its Application to Internet Traffic , 2008, ICIC.

[138]  Rong Jin,et al.  Semi-supervised SVM batch mode active learning for image retrieval , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[139]  Alexander Gammerman,et al.  Qualified predictions for microarray and proteomics pattern diagnostics with confidence machines , 2005, Int. J. Neural Syst..

[140]  Walter Willinger,et al.  On the self-similar nature of Ethernet traffic , 1993, SIGCOMM '93.

[141]  Li Guo,et al.  An active learning based TCM-KNN algorithm for supervised network intrusion detection , 2007, Comput. Secur..

[142]  Mark Craven,et al.  Multiple-Instance Active Learning , 2007, NIPS.

[143]  D. F. Specht,et al.  Experience with adaptive probabilistic neural networks and adaptive general regression neural networks , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).

[144]  Göran Falkman,et al.  Sequential Conformal Anomaly Detection in trajectories based on Hausdorff distance , 2011, 14th International Conference on Information Fusion.

[145]  Alexander Gammerman,et al.  Machine-Learning Applications of Algorithmic Randomness , 1999, ICML.

[146]  Harry Wechsler,et al.  A Martingale Framework for Detecting Changes in Data Streams by Testing Exchangeability , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[147]  Abraham Wald,et al.  An Extension of Wilks' Method for Setting Tolerance Limits , 1943 .

[148]  Luca Salgarelli,et al.  Support Vector Machines for TCP traffic classification , 2009, Comput. Networks.

[149]  Ilia Nouretdinov,et al.  Transductive Confidence Machine Is Universal , 2003, ALT.

[150]  Mark Craven,et al.  An Analysis of Active Learning Strategies for Sequence Labeling Tasks , 2008, EMNLP.

[151]  Thomas M. Loughin,et al.  A systematic comparison of methods for combining p , 2004, Comput. Stat. Data Anal..

[152]  Inder Jeet Taneja,et al.  On Generalized Information Measures and Their Applications , 1989 .

[153]  Wei Li,et al.  Approaching Real-time Network Traffic Classification , 2013 .

[154]  Dimitry Devetyarov,et al.  Confidence and venn machines and their applications to proteomics , 2010 .

[155]  Olga Ivina,et al.  Conformal prediction of air pollution concentrations for the Barcelona Metropolitan Region , 2012 .

[156]  J. Lafferty,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[157]  Vladimir Vovk,et al.  Aggregating strategies , 1990, COLT '90.

[158]  A. Gammerman,et al.  Statistical Applications in Genetics and Molecular Biology , 2011 .

[159]  Stefan Wrobel,et al.  Active Hidden Markov Models for Information Extraction , 2001, IDA.

[160]  Bellotti Ag,et al.  Confidence machines for microarray classification and feature selection. , 2006 .

[161]  Klaus Brinker,et al.  Incorporating Diversity in Active Learning with Support Vector Machines , 2003, ICML.

[162]  Russell Greiner,et al.  Optimistic Active-Learning Using Mutual Information , 2007, IJCAI.

[163]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[164]  Lixin Shi,et al.  Batch Mode Sparse Active Learning , 2010, 2010 IEEE International Conference on Data Mining Workshops.

[165]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[166]  Huan Liu,et al.  Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution , 2003, ICML.

[167]  Gerhard Widmer,et al.  Learning in the Presence of Concept Drift and Hidden Contexts , 1996, Machine Learning.

[168]  Harris Papadopoulos,et al.  Assessment of Stroke Risk Based on Morphological Ultrasound Image Analysis with Conformal Prediction , 2010, AIAI.

[169]  Vladimir Vovk,et al.  On-line confidence machines are well-calibrated , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[170]  Vladimir Vovk,et al.  Cross-conformal predictors , 2012, Annals of Mathematics and Artificial Intelligence.

[171]  Susanne Albers,et al.  Efficient Algorithms, Essays Dedicated to Kurt Mehlhorn on the Occasion of His 60th Birthday , 2009, Efficient Algorithms.

[172]  Sebastian Thrun,et al.  Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.

[173]  Rikard Laxhammar,et al.  Conformal anomaly detection: Detecting abnormal trajectories in surveillance applications , 2014 .

[174]  Edward Y. Chang,et al.  Support vector machine active learning for image retrieval , 2001, MULTIMEDIA '01.

[175]  Hajo Zeeb,et al.  Tutorial: Using Confidence Curves in Medical Research , 2005, Biometrical journal. Biometrische Zeitschrift.

[176]  Raymond J. Mooney,et al.  Active Learning for Probability Estimation Using Jensen-Shannon Divergence , 2005, ECML.

[177]  Rong Jin,et al.  Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval , 2009, IEEE Transactions on Knowledge and Data Engineering.

[178]  A. Asuncion,et al.  UCI Machine Learning Repository, University of California, Irvine, School of Information and Computer Sciences , 2007 .

[179]  Donald Fraser,et al.  Nonparametric Tolerance Regions , 1953 .

[180]  J. Downing,et al.  Gene Expression Profiling of Pediatric Acute Myelogenous Leukemia Materials and Methods , 2022 .

[181]  D. Fraser Sequentially Determined Statistically Equivalent Blocks , 1951 .

[182]  André Carlos Ponce de Leon Ferreira de Carvalho,et al.  Multiclass SVM Model Selection Using Particle Swarm Optimization , 2006, 2006 Sixth International Conference on Hybrid Intelligent Systems (HIS'06).

[183]  Feihu Qi,et al.  SVM Model Selection with the VC Bound , 2004, CIS.

[184]  Jonathan Crook,et al.  Support vector machines for credit scoring and discovery of significant features , 2009, Expert Syst. Appl..

[185]  Harry Wechsler,et al.  Transductive confidence machine for active learning , 2003, Proceedings of the International Joint Conference on Neural Networks, 2003..

[186]  Pedro Sousa,et al.  Multi‐scale Internet traffic forecasting using neural networks and time series methods , 2010, Expert Syst. J. Knowl. Eng..

[187]  Harry Wechsler,et al.  On the Detection of Concept Changes in Time-Varying Data Stream by Testing Exchangeability , 2005, UAI.

[188]  Grenville J. Armitage,et al.  A survey of techniques for internet traffic classification using machine learning , 2008, IEEE Communications Surveys & Tutorials.

[189]  Sebastián Dormido-Canto,et al.  INCREMENTAL SUPPORT VECTOR MACHINES FOR FAST RELIABLE INCREMENTAL SUPPORT VECTOR MACHINES FOR FAST RELIABLE IMAGE RECOGNITION , 2013 .

[190]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[191]  Huan Liu,et al.  Efficient Feature Selection via Analysis of Relevance and Redundancy , 2004, J. Mach. Learn. Res..

[192]  Hans Wackernagel,et al.  Multivariate Geostatistics: An Introduction with Applications , 1996 .

[193]  Kun Deng,et al.  Balancing exploration and exploitation: a new algorithm for active machine learning , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[194]  Larry A. Wasserman,et al.  Distribution Free Prediction Bands , 2012, ArXiv.

[195]  Sean P. Meyn,et al.  The value of volatile resources in electricity markets , 2010, 49th IEEE Conference on Decision and Control (CDC).

[196]  Rikard Laxhammar,et al.  Conformal prediction for distribution-independent anomaly detection in streaming vessel data , 2010, StreamKDD '10.

[197]  Oliver Spatscheck,et al.  Accurate, scalable in-network identification of p2p traffic using application signatures , 2004, WWW '04.

[198]  Ciril Grošelj,et al.  Transductive Machine Learning for Reliable Medical Diagnostics , 2005, Journal of Medical Systems.

[199]  Alexander Gammerman,et al.  Valid predictions with confidence estimation in an air pollution problem , 2012, Progress in Artificial Intelligence.

[200]  Doroteo Torre Toledano,et al.  Emulating DNA: Rigorous Quantification of Evidential Weight in Transparent and Testable Forensic Speaker Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[201]  A. J. Gammerman,et al.  Plant promoter prediction with confidence estimation , 2005, Nucleic acids research.

[202]  O. Miettinen,et al.  Theoretical Epidemiology: Principles of Occurrence Research in Medicine. , 1987 .

[203]  T. Poggio,et al.  The Mathematics of Learning: Dealing with Data , 2005, 2005 International Conference on Neural Networks and Brain.

[204]  Claire Monteleoni,et al.  Practical Online Active Learning for Classification , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[205]  Andreas S. Weigend,et al.  Predictions with Confidence Intervals ( Local Error Bars ) , 1994 .

[206]  Conrad Sanderson,et al.  Biometric Person Recognition: Face, Speech and Fusion , 2008 .

[207]  Alexandre B. Tsybakov,et al.  Introduction to Nonparametric Estimation , 2008, Springer series in statistics.

[208]  Stephen D. Bay,et al.  Characterizing Model Erros and Differences , 2000, ICML.

[209]  Salvatore J. Stolfo,et al.  Cost-based modeling for fraud and intrusion detection: results from the JAM project , 2000, Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00.

[210]  H. Sebastian Seung,et al.  Selective Sampling Using the Query by Committee Algorithm , 1997, Machine Learning.

[211]  Harris Papadopoulos,et al.  Evaluation of the Risk of Stroke with Confidence Predictions Based on Ultrasound Carotid Image Analysis , 2012, Int. J. Artif. Intell. Tools.

[212]  Zhiyuan Luo,et al.  Reliable Probabilistic Classification of Internet Traffic , 2009, Int. J. Inf. Acquis..

[213]  D. Sculley,et al.  Online Active Learning Methods for Fast Label-Efficient Spam Filtering , 2007, CEAS.

[214]  Vladimir Vovk,et al.  Conformal predictors in early diagnostics of ovarian and breast cancers , 2012, Progress in Artificial Intelligence.

[215]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[216]  M. A. Girshick,et al.  A BAYES APPROACH TO A QUALITY CONTROL MODEL , 1952 .

[217]  Carlo Zaniolo,et al.  An adaptive learning approach for noisy data streams , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[218]  Marco Zaffalon,et al.  Credal Model Averaging: An Extension of Bayesian Model Averaging to Imprecise Probabilities , 2008, ECML/PKDD.

[219]  Christos Faloutsos,et al.  Adaptive, Hands-Off Stream Mining , 2003, VLDB.

[220]  A. Shiryaev On Optimum Methods in Quickest Detection Problems , 1963 .

[221]  Robert Tibshirani,et al.  The Entire Regularization Path for the Support Vector Machine , 2004, J. Mach. Learn. Res..

[222]  Harry Wechsler,et al.  Query by Transduction , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[223]  Gert Cauwenberghs,et al.  Incremental and Decremental Support Vector Machine Learning , 2000, NIPS.

[224]  Geoffrey J McLachlan,et al.  Selection bias in gene extraction on the basis of microarray gene-expression data , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[225]  Daphne Koller,et al.  Active learning: theory and applications , 2001 .

[226]  William A. Gale,et al.  A sequential algorithm for training text classifiers , 1994, SIGIR '94.

[227]  Pedro M. Domingos,et al.  Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier , 1996, ICML.

[228]  Guillaume Urvoy-Keller,et al.  Application-based feature selection for Internet traffic classification , 2010, 2010 22nd International Teletraffic Congress (lTC 22).

[229]  Alexander Gammerman,et al.  Transduction with Confidence and Credibility , 1999, IJCAI.

[230]  Jesús Vega,et al.  Region selection and image classification methodology using a non-conformity measure , 2012, Progress in Artificial Intelligence.

[231]  Matti Kääriäinen Sinuhe - Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model , 2009, EMNLP.

[232]  A. Murari,et al.  Accurate and reliable image classification by using conformal predictors in the TJ-II Thomson scattering. , 2010, The Review of scientific instruments.

[233]  Sayan Mukherjee,et al.  Feature Selection for SVMs , 2000, NIPS.

[234]  Douglas A. Reynolds,et al.  Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[235]  Igor Kononenko,et al.  Comparison of approaches for estimating reliability of individual regression predictions , 2008, Data Knowl. Eng..

[236]  Claude E. Shannon,et al.  The mathematical theory of communication , 1950 .

[237]  J. Langford Tutorial on Practical Prediction Theory for Classification , 2005, J. Mach. Learn. Res..

[238]  Sethuraman Panchanathan,et al.  Generalized Query by Transduction for online active learning , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[239]  A. Gammerman,et al.  Evolutionary Conformal Prediction for Breast Cancer Diagnosis , 2009, 2009 9th International Conference on Information Technology and Applications in Biomedicine.

[240]  John Shawe-Taylor,et al.  Tighter PAC-Bayes Bounds , 2006, NIPS.

[241]  C. Bonwell,et al.  Active Learning: Creating Excitement in the Classroom. ERIC Digest. , 1991 .

[242]  Barbara Hammer,et al.  Supervised Neural Gas for Learning Vector Quantization , 2002 .

[243]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[244]  James L. McClelland,et al.  Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .

[245]  C.-C. Jay Kuo,et al.  GA-Based Internet Traffic Classification Technique for QoS Provisioning , 2006, 2006 International Conference on Intelligent Information Hiding and Multimedia.

[246]  Burr Settles,et al.  From Theories to Queries: Active Learning in Practice , 2011 .

[247]  Alexander G. Gray,et al.  Retrofitting Decision Tree Classifiers Using Kernel Density Estimation , 1995, ICML.

[248]  Varun Chandola,et al.  Anomaly detection: A survey , 2009, CSUR.

[249]  Stefano Tarantola,et al.  Introduction to Sensitivity Analysis , 2008 .

[250]  Lucia Specia,et al.  Improving the Confidence of Machine Translation Quality Estimates , 2009, MTSUMMIT.

[251]  Harris Papadopoulos,et al.  Reliable pavement backcalculation with confidence estimation , 2011 .

[252]  Peter A. Flach,et al.  ROC Analysis in Artificial Intelligence, 1st International Workshop, ROCAI-2004, Valencia, Spain, August 22, 2004 , 2004, ROCAI.

[253]  László Györfi,et al.  A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.

[254]  Carey L. Williamson,et al.  A Longitudinal Study of P2P Traffic Classification , 2006, 14th IEEE International Symposium on Modeling, Analysis, and Simulation.

[255]  Haris Haralambous,et al.  Neural Networks Regression Inductive Conformal Predictor and Its Application to Total Electron Content Prediction , 2010, ICANN.

[256]  Jerry Nedelman,et al.  Book review: “Bayesian Data Analysis,” Second Edition by A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin Chapman & Hall/CRC, 2004 , 2005, Comput. Stat..

[257]  John Venn,et al.  The Logic Of Chance , 1888 .

[258]  Ida G. Sprinkhuizen-Kuyper,et al.  A Comparison of Two Approaches to Classify with Guaranteed Performance , 2007, PKDD.

[259]  Harris Papadopoulos,et al.  Reliable Probability Estimates Based on Support Vector Machines for Large Multiclass Datasets , 2012, AIAI.

[260]  Naoki Abe,et al.  Query Learning Strategies Using Boosting and Bagging , 1998, ICML.

[261]  Eyke Hüllermeier,et al.  Case-Based Approximate Reasoning (Theory and Decision Library B) , 2007 .

[262]  Pietro Perona,et al.  Entropy-based active learning for object recognition , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[263]  Marcus A. Maloof,et al.  Dynamic weighted majority: a new ensemble method for tracking concept drift , 2003, Third IEEE International Conference on Data Mining.

[264]  Sadaoki Furui,et al.  Recent advances in speaker recognition , 1997, Pattern Recognit. Lett..

[265]  I. Guttman,et al.  Statistical Tolerance Regions: Classical and Bayesian , 1970 .

[266]  G. Shafer,et al.  Probability and Finance: It's Only a Game! , 2001 .

[267]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[268]  Arnold W. M. Smeulders,et al.  Active learning using pre-clustering , 2004, ICML.

[269]  Haris Haralambous,et al.  Reliable prediction intervals with regression neural networks , 2011, Neural Networks.

[270]  Harry Wechsler,et al.  Detecting Changes in Unlabeled Data Streams Using Martingale , 2007, IJCAI.

[271]  Gert Cauwenberghs,et al.  SVM incremental learning, adaptation and optimization , 2003, Proceedings of the International Joint Conference on Neural Networks, 2003..

[272]  E. S. Page.  On problems in which a change in a parameter occurs at an unknown point , 1957 .

[273]  Isabelle Guyon,et al.  An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[274]  Alexander Gammerman,et al.  Testing Exchangeability On-Line , 2003, ICML.

[275]  Alexander Gammerman,et al.  Feature Selection by Conformal Predictor , 2011, EANN/AIAI.

[276]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[277]  Carlo Zaniolo,et al.  Fast and Light Boosting for Adaptive Mining of Data Streams , 2004, PAKDD.

[278]  Trevor J. Hastie,et al.  Discriminative vs Informative Learning , 1997, KDD.

[279]  Eleazar Eskin,et al.  Anomaly Detection over Noisy Data using Learned Probability Distributions , 2000, ICML.

[280]  Roland Kuhn,et al.  PORTAGE: with Smoothed Phrase Tables and Segment Choice Models , 2006, WMT@HLT-NAACL.

[281]  Gunnar Rätsch,et al.  Learning to Predict the Leave-One-Out Error of Kernel Based Classifiers , 2001, ICANN.

[282]  O. M. Halck,et al.  Using Hard Classifiers to Estimate Conditional Class Probabilities , 2002, ECML.

[283]  Michele Nappi,et al.  Robust re-identification using randomness and statistical learning: Quo vadis , 2012, Pattern Recognit. Lett..

[284]  Xiaohong Guan,et al.  Accurate Classification of the Internet Traffic Based on the SVM Method , 2007, 2007 IEEE International Conference on Communications.

[285]  Stefan Axelsson,et al.  The base-rate fallacy and the difficulty of intrusion detection , 2000, TSEC.

[286]  Carl Gold,et al.  Model selection for support vector machine classification , 2002, Neurocomputing.

[287]  Michael T. Goodrich,et al.  Data Structures and Algorithms in Java with Cdrom , 1998 .

[288]  Igor Kononenko,et al.  Estimation of individual prediction reliability using the local sensitivity analysis , 2008, Applied Intelligence.

[289]  G. Shafer,et al.  Algorithmic Learning in a Random World , 2005 .

[290]  Hilan Bensusan,et al.  Meta-Learning by Landmarking Various Learning Algorithms , 2000, ICML.

[291]  Ian Witten,et al.  Data Mining , 2000 .

[292]  Andy Adler.  Sample images can be independently restored from face recognition templates , 2003, CCECE 2003 - Canadian Conference on Electrical and Computer Engineering. Toward a Caring and Humane Technology (Cat. No.03CH37436).

[293]  Marco Zaffalon,et al.  Reliable diagnoses of dementia by the naive credal classifier inferred from incomplete cognitive data , 2003, Artif. Intell. Medicine.

[294]  Ida G. Sprinkhuizen-Kuyper,et al.  Version Space Support Vector Machines , 2006, ECAI.

[295]  Matjaz Kukar,et al.  Quality assessment of individual classifications in machine learning and data mining , 2006, Knowledge and Information Systems.

[296]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[297]  Ming Li,et al.  An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.

[298]  João Gama,et al.  Correcting Streaming Predictions of an Electricity Load Forecast System Using a Prediction Reliability Estimate , 2011, ICMMI.

[299]  W. A. Shewhart,et al.  The Application of Statistics as an Aid in Maintaining Quality of a Manufactured Product , 1925 .

[300]  Alexander Gammerman,et al.  Strangeness Minimisation Feature Selection with Confidence Machines , 2006, IDEAL.

[301]  P. Rigollet,et al.  Optimal rates for plug-in estimators of density level sets , 2006, math/0611473.

[302]  Li Guo,et al.  Network anomaly detection based on TCM-KNN algorithm , 2007, ASIACCS '07.

[303]  Ilia Nouretdinov.  Offline Nearest Neighbour Transductive Confidence Machine , 2008, Industrial Conference on Data Mining - Posters and Workshops.

[304]  Göran Falkman,et al.  Online Learning and Sequential Anomaly Detection in Trajectories , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[305]  Anja Feldmann,et al.  Dynamic Application-Layer Protocol Analysis for Network Intrusion Detection , 2006, USENIX Security Symposium.

[306]  Pat Langley,et al.  Selection of Relevant Features and Examples in Machine Learning , 1997, Artif. Intell..

[307]  Harry Wechsler,et al.  Open set face recognition using transduction , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[308]  Sebastian Zander,et al.  Automated traffic classification and application identification using machine learning , 2005, The IEEE Conference on Local Computer Networks 30th Anniversary (LCN'05).

[309]  John Shawe-Taylor,et al.  Prediction with the SVM Using Test Point Margins , 2010, Data Mining.

[310]  Lázaro Emílio Makili,et al.  Active Learning Using Conformal Predictors: Application to Image Classification , 2012 .

[311]  Alexander Gammerman,et al.  Machine learning classification with confidence: Application of transductive conformal predictors to MRI-based diagnostic and prognostic markers in depression , 2011, NeuroImage.

[312]  Vladimir Vovk,et al.  Ridge Regression Confidence Machine , 2001, International Conference on Machine Learning.

[313]  Harry Wechsler,et al.  Adaptive Support Vector Machine for Time-Varying Data Streams Using Martingale , 2005, IJCAI.

[314]  Walter L. Ruzzo,et al.  Improved Gene Selection for Classification of Microarrays , 2002, Pacific Symposium on Biocomputing.

[315]  Harris Papadopoulos,et al.  Inductive Confidence Machines for Regression , 2002, ECML.

[316]  Avrim Blum,et al.  Learning from Labeled and Unlabeled Data using Graph Mincuts , 2001, ICML.

[317]  Rong Jin,et al.  Batch mode active learning and its application to medical image classification , 2006, ICML.

[318]  Ralf Klinkenberg,et al.  Learning drifting concepts: Example selection vs. example weighting , 2004, Intell. Data Anal..

[319]  Daniel Barbará,et al.  Detecting outliers using transduction and statistical testing , 2006, KDD '06.

[320]  Harris Papadopoulos,et al.  Regression Conformal Prediction with Nearest Neighbours , 2014, J. Artif. Intell. Res..

[321]  J. Tukey,et al.  Non-Parametric Estimation. I. Validation of Order Statistics , 1945 .

[322]  Harris Papadopoulos,et al.  Reliable diagnosis of acute abdominal pain with conformal prediction , 2009 .

[323]  A. Rényi,et al.  Théorie des éléments saillants d'une suite d'observations , 1962 .

[324]  Padraig Cunningham,et al.  Confidence and prediction intervals for neural network ensembles , 1999, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339).

[325]  Harris Papadopoulos,et al.  Confidence Predictions for the Diagnosis of Acute Abdominal Pain , 2009, AIAI.

[326]  R. Stolzenberg,et al.  Multiple Regression Analysis , 2004 .

[327]  Harry Wechsler,et al.  The FERET database and evaluation procedure for face-recognition algorithms , 1998, Image Vis. Comput..

[328]  A. R. Crathorne,et al.  Economic Control of Quality of Manufactured Product. , 1933 .