Bias reduction through conditional conformal prediction

Conformal prediction (CP) is a relatively new framework in which predictive models output sets of predictions with a bound on the error rate, i.e., the probability of making an erroneous prediction ...
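
As a rough illustration of what "sets of predictions with a bound on the error rate" can look like in practice, the following is a minimal sketch of an inductive, class-conditional conformal classifier. It assumes that "conditional" in the title refers to label-conditional calibration and that the nonconformity score of an example is something like one minus the model's estimated probability for the label; neither is confirmed by the text above. All function and variable names are hypothetical.

```python
from typing import Dict, List, Sequence


def class_conditional_p_values(
    cal_scores: Sequence[float],
    cal_labels: Sequence[int],
    test_scores: Dict[int, float],
) -> Dict[int, float]:
    """Return one p-value per candidate label.

    cal_scores[i] is the nonconformity score of calibration example i under
    its true label cal_labels[i]. test_scores[y] is the nonconformity score
    of the test object under the tentative label y. Both are assumed to come
    from an underlying model that is not shown here.
    """
    p_values = {}
    for label, score in test_scores.items():
        # Class-conditional step: compare only against calibration
        # examples that share the tentative label.
        same_class = [s for s, y in zip(cal_scores, cal_labels) if y == label]
        at_least_as_strange = sum(1 for s in same_class if s >= score)
        # The +1 terms count the test object itself.
        p_values[label] = (at_least_as_strange + 1) / (len(same_class) + 1)
    return p_values


def prediction_set(p_values: Dict[int, float], epsilon: float) -> List[int]:
    """Keep every label whose p-value exceeds the significance level epsilon."""
    return sorted(label for label, p in p_values.items() if p > epsilon)


# Toy calibration data: four examples of class 0, three of class 1.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.9, 0.8, 0.7]
cal_labels = [0, 0, 0, 0, 1, 1, 1]
# Nonconformity of one test object under each tentative label.
test_scores = {0: 0.25, 1: 0.95}

p = class_conditional_p_values(cal_scores, cal_labels, test_scores)
wide_set = prediction_set(p, epsilon=0.2)
narrow_set = prediction_set(p, epsilon=0.3)
```

Raising `epsilon` tolerates more errors and so tends to shrink the prediction set; calibrating per class is one way to aim for the error bound holding within each class rather than only on average.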
