Is this model reliable for everyone? Testing for strong calibration
[1] N. Petrick, et al. Monitoring machine learning (ML)-based risk prediction algorithms in the presence of confounding medical interventions, 2022, ArXiv.
[2] S. Ermon, et al. Modular Conformal Calibration, 2022, ICML.
[3] Rachael V. Phillips, et al. Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare, 2022, npj Digital Medicine.
[4] Jared A. Dunnmon, et al. Domino: Discovering Systematic Errors with Cross-Modal Embeddings, 2022, ICLR.
[5] Leying Guan, et al. Localized Conformal Prediction: A Generalized Inference Framework for Conformal Prediction, 2021, Biometrika.
[6] Ali Shojaie, et al. Inference on function-valued parameters using a restricted score test, 2021, arXiv:2105.06646.
[7] Yuekai Sun, et al. Statistical inference for individual fairness, 2021, ICLR.
[8] Shira Mitchell, et al. Algorithmic Fairness: Choices, Assumptions, and Definitions, 2021, Annual Review of Statistics and Its Application.
[9] S. Savarese, et al. Local calibration: metrics and recalibration, 2021, UAI.
[10] Eitan Bachmat, et al. Addressing bias in prediction models by improving subpopulation calibration, 2020, J. Am. Medical Informatics Assoc.
[11] John C. Duchi, et al. Distributionally Robust Losses for Latent Covariate Mixtures, 2020, Oper. Res.
[12] Kinjal Basu, et al. Evaluating Fairness Using Permutation Tests, 2020, KDD.
[13] Tengyu Ma, et al. Individual Calibration with Randomized Forecasting, 2020, ICML.
[14] Jean Feng, et al. Efficient nonparametric statistical inference on population feature importance using Shapley values, 2020, ICML.
[15] G. A. Young, et al. High‐dimensional Statistics: A Non‐asymptotic Viewpoint, Martin J. Wainwright, Cambridge University Press, 2019, xvii + 552 pages, £57.99, hardback ISBN: 978‐1‐1084‐9802‐9, 2020, International Statistical Review.
[16] Yuekai Sun, et al. Auditing ML Models for Individual Bias and Unfairness, 2020, AISTATS.
[17] Martin Vechev, et al. Learning Certified Individually Fair Representations, 2020, NeurIPS.
[18] Yuhong Yang, et al. Is a Classification Procedure Good Enough?—A Goodness-of-Fit Assessment Tool for Classification Learning, 2019, Journal of the American Statistical Association.
[19] Emmanuel J. Candès, et al. With Malice Towards None: Assessing Uncertainty via Equalized Coverage, 2019, ArXiv.
[20] Rajen Dinesh Shah, et al. Goodness‐of‐fit testing in high dimensional generalized linear models, 2019, Journal of the Royal Statistical Society: Series B (Statistical Methodology).
[21] Christina Ilvento, et al. Metric Learning for Individual Fairness, 2019, FORC.
[22] E. Candès, et al. The limits of distribution-free conditional predictive inference, 2019, Information and Inference: A Journal of the IMA.
[23] Alexander Gammerman, et al. Conformal calibrators, 2019, COPA.
[24] John C. Duchi, et al. Learning Models with Uniform Performance via Distributionally Robust Optimization, 2018, ArXiv.
[25] Tim Kraska, et al. Slice Finder: Automated Data Slicing for Model Validation, 2018, 2019 IEEE 35th International Conference on Data Engineering (ICDE).
[26] Guy N. Rothblum, et al. Multicalibration: Calibration for the (Computationally-Identifiable) Masses, 2018, ICML.
[27] James Y. Zou, et al. Multiaccuracy: Black-Box Post-Processing for Fairness in Classification, 2018, AIES.
[28] Timnit Gebru, et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, 2018, FAT.
[29] E. Gombay. Editor’s special invited paper: On the efficient score vector in sequential monitoring, 2017.
[30] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.
[31] Nathan Srebro, et al. Equality of Opportunity in Supervised Learning, 2016, NIPS.
[32] Yvonne Vergouwe, et al. A calibration hierarchy for risk models was defined: from utopia to empirical data, 2016, Journal of clinical epidemiology.
[33] Jianxin Shi, et al. Developing and evaluating polygenic risk prediction models for stratified disease prevention, 2016, Nature Reviews Genetics.
[34] Ewout W. Steyerberg, et al. F1000Prime recommendation of Calibration of risk prediction models: impact on decision-analytic performance, 2014.
[35] Diane Lacaille, et al. 2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk, 2014.
[36] Larry Wasserman, et al. Distribution‐free prediction bands for non‐parametric regression, 2014.
[37] Vladimir Vovk, et al. Conditional validity of inductive conformal predictors, 2012, Machine Learning.
[38] Anja De Waegenaere, et al. Robust Solutions of Optimization Problems Affected by Uncertain Probabilities, 2011, Manag. Sci.
[39] Toniann Pitassi, et al. Fairness through awareness, 2011, ITCS '12.
[40] Edit Gombay, et al. Sequential Change-Point Detection and Estimation, 2003.
[41] Amit Mitra, et al. Statistical Quality Control, 2002, Technometrics.
[42] Nils Lid Hjort, et al. Goodness‐of‐fit processes for logistic regression: simulation results, 2002, Statistics in medicine.
[43] Z. Ying, et al. Model‐Checking Techniques Based on Cumulative Residuals, 2002, Biometrics.
[44] D. Hosmer, et al. A comparison of goodness-of-fit tests for the logistic regression model, 1997, Statistics in medicine.
[45] D. Hosmer, et al. Applied Logistic Regression, 1991.
[46] Douglas M. Hawkins, et al. Diagnostics for use with regression recursive residuals, 1991.
[47] Suchi Saria, et al. Evaluating Model Robustness and Stability to Dataset Shift, 2021, AISTATS.
[48] Hongseok Namkoong, et al. Evaluating model performance under worst-case subpopulations, 2024, NeurIPS.
[49] Jennifer G. Robinson, et al. Reply: 2013 ACC/AHA guideline on the assessment of cardiovascular risk, 2014, Journal of the American College of Cardiology.
[50] W. Gasarch, et al. The Book Review Column, 2022.
[51] John C. Platt, et al. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, 1999.
[52] A. Tsiatis. A note on a goodness-of-fit test for the logistic regression model, 1980.
[53] J. Durbin, et al. Techniques for Testing the Constancy of Regression Relationships Over Time, 1975.