Decision-Making Under Selective Labels: Optimal Finite-Domain Policies and Beyond
[1] Thorsten Joachims, et al. Batch learning from logged bandit feedback through counterfactual risk minimization, 2015, J. Mach. Learn. Res.
[2] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[3] John Langford, et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits, 2014, ICML.
[4] Jure Leskovec, et al. The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables, 2017, KDD.
[5] Alexandra Chouldechova, et al. Learning under selective labels in the presence of expert consistency, 2018, ArXiv.
[6] Aaron Roth, et al. Fairness in Learning: Classic and Contextual Bandits, 2016, NIPS.
[7] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[8] Mingyan Liu, et al. How Do Fair Decisions Fare in Long-term Qualification?, 2020, NeurIPS.
[9] Dean Eckles, et al. Thompson sampling with the online bootstrap, 2014, ArXiv.
[10] Nathan Kallus, et al. Balanced Policy Evaluation and Learning, 2017, NeurIPS.
[11] Benjamin Van Roy, et al. Bootstrapped Thompson Sampling and Deep Exploration, 2015, ArXiv.
[12] Nathan Kallus, et al. Residual Unfairness in Fair Machine Learning from Prejudiced Data, 2018, ICML.
[13] Stefan Wager, et al. Policy Learning With Observational Data, 2017, Econometrica.
[14] Krikamol Muandet, et al. Fair Decisions Despite Imperfect Predictions, 2019, AISTATS.
[15] Percy Liang, et al. Fairness Without Demographics in Repeated Loss Minimization, 2018, ICML.
[16] David K. Smith, et al. Dynamic Programming and Optimal Control. Volume 1, 1996.
[17] Sampath Kannan, et al. A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem, 2018, NeurIPS.
[18] Lihong Li, et al. Provable Optimal Algorithms for Generalized Linear Contextual Bandits, 2017, ArXiv.
[19] Aurélien Garivier, et al. Parametric Bandits: The Generalized Linear Case, 2010, NIPS.
[20] Toniann Pitassi, et al. Causal Modeling for Fairness in Dynamical Systems, 2019, ICML.
[21] Yiling Chen, et al. A Short-term Intervention for Long-term Fairness in the Labor Market, 2017, WWW.
[22] Krishna P. Gummadi, et al. On the Long-term Impact of Algorithmic Decision Policies: Effort Unfairness and Feature Segregation through Social Learning, 2019, ICML.
[23] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[24] Suresh Venkatasubramanian, et al. Runaway Feedback Loops in Predictive Policing, 2017, FAT.
[25] Celestine Mendler-Dünner, et al. Performative Prediction, 2020, ICML.
[26] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[27] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[28] Nathan Srebro, et al. From Fair Decision Making To Social Equality, 2018, FAT.
[29] Manuel Gomez-Rodriguez, et al. Decisions, Counterfactual Explanations and Strategic Behavior, 2020, NeurIPS.
[30] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[31] David C. Parkes, et al. From Predictions to Decisions: Using Lookahead Regularization, 2020, NeurIPS.
[32] R. Weber. On the Gittins Index for Multiarmed Bandits, 1992.
[33] Aaron Roth, et al. Equal Opportunity in Online Classification with Partial Feedback, 2019, NeurIPS.
[34] Alexandra Chouldechova, et al. Counterfactual risk assessments, evaluation, and fairness, 2020, FAT*.
[35] Khashayar Khosravi, et al. Mostly Exploration-Free Algorithms for Contextual Bandits, 2017, Manag. Sci.
[36] Avi Feller, et al. Algorithmic Decision Making and the Cost of Fairness, 2017, KDD.
[37] Esther Rolf, et al. Delayed Impact of Fair Machine Learning, 2018, ICML.
[38] John Langford, et al. A Contextual Bandit Bake-off, 2018, J. Mach. Learn. Res.