Fair Algorithms for Infinite and Contextual Bandits

We study fairness in linear bandit problems. Starting from the notion of meritocratic fairness introduced in Joseph et al. [2016], we carry out a more refined analysis of a more general problem, achieving better performance guarantees with fewer modelling assumptions on the number and structure of available choices as well as the number selected. We also analyze the previously unstudied question of fairness in infinite linear bandit problems, obtaining instance-dependent regret upper bounds as well as lower bounds demonstrating that this instance-dependence is necessary. The result is a framework for meritocratic fairness in an online linear setting that is substantially more powerful, general, and realistic than the current state of the art.
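Since the abstract invokes the meritocratic fairness notion of Joseph et al. [2016] without spelling it out, the following is a minimal illustrative sketch, for the finite-armed classic case, of one interval-chaining selection rule in that spirit: never play an arm with lower probability than an arm that could, given current uncertainty, be worse. The function name `fair_round`, the Hoeffding-style width formula, and the `delta` parameter are assumptions made for illustration, not the paper's algorithm or constants.

```python
import numpy as np

def fair_round(counts, means, t, delta=0.05, rng=None):
    """One round of a hypothetical interval-chaining selection rule.

    counts[i] -- number of times arm i has been played so far
    means[i]  -- empirical mean reward of arm i (rewards assumed in [0, 1])
    Returns an arm index drawn uniformly from the "chain" of arms whose
    confidence intervals overlap, directly or transitively, the interval
    of the arm with the highest upper confidence bound.
    """
    rng = rng or np.random.default_rng()
    counts = np.maximum(np.asarray(counts, dtype=float), 1.0)
    means = np.asarray(means, dtype=float)
    k = len(means)

    # Hoeffding-style confidence radius; shrinks as an arm is played more.
    width = np.sqrt(np.log(2.0 * k * (t + 1) / delta) / (2.0 * counts))
    ucb, lcb = means + width, means - width

    # Grow the chain from the arm with the highest UCB: add any arm whose
    # interval overlaps an interval already in the chain.
    chain = {int(np.argmax(ucb))}
    changed = True
    while changed:
        changed = False
        for i in range(k):
            if i not in chain and any(lcb[j] <= ucb[i] and lcb[i] <= ucb[j] for j in chain):
                chain.add(i)
                changed = True

    # Playing uniformly inside the chain never favors an arm over one that
    # might, given current uncertainty, have a higher expected reward.
    return int(rng.choice(sorted(chain)))
```

In a full run one would update `counts` and `means` with the observed reward of the returned arm each round; the linear and infinite-armed settings studied in the paper require different machinery than this finite-armed sketch.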

[1] J. Walrand et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part II: Markovian rewards, 1987.

[2] Peter Auer et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.

[3] Philip M. Long et al. Reinforcement Learning with Immediate Rewards and Linear Hypotheses, 2003, Algorithmica.

[4] Adam Tauman Kalai et al. Online convex optimization in the bandit setting: gradient descent without a gradient, 2004, SODA '05.

[5] S. Vempala et al. Hit-and-Run from a Corner, 2006.

[6] S. Vempala. Geometric Random Walks: a Survey, 2007.

[7] H. Robbins. Some aspects of the sequential design of experiments, 1952.

[8] Thomas P. Hayes et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.

[9] Martin Pál et al. Contextual Multi-Armed Bandits, 2010, AISTATS.

[10] Wei Chu et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.

[11] Jun Sakuma et al. Fairness-aware Learning through Regularization Approach, 2011, IEEE 11th International Conference on Data Mining Workshops.

[12] Csaba Szepesvári et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.

[13] Franco Turini et al. k-NN as an implementation of situation testing for discrimination discovery and prevention, 2011, KDD.

[14] Sébastien Bubeck et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.

[15] Toniann Pitassi et al. Fairness through awareness, 2011, ITCS '12.

[16] John Langford et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits, 2014, ICML.

[17] Carlos Eduardo Scheidegger et al. Certifying and Removing Disparate Impact, 2014, KDD.

[18] Joel A. Tropp et al. An Introduction to Matrix Concentration Inequalities, 2015, Found. Trends Mach. Learn.

[19] Andrew D. Selbst et al. Big Data's Disparate Impact, 2016.

[20] Cathy O'Neil et al. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, 2016, Vikalpa: The Journal for Decision Makers.

[21] Adam R. Klivans et al. Facet Guessing for Finding the M-Best Integral Solutions of a Linear Program, 2016.

[22] Matthew Joseph et al. Fairness in Learning: Classic and Contextual Bandits, 2016, NIPS.

[23] Nathan Srebro et al. Equality of Opportunity in Supervised Learning, 2016, NIPS.

[24] Alexandra Chouldechova et al. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, 2016, Big Data.

[25] Krishna P. Gummadi et al. Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment, 2016, WWW.

[26] Nathan Srebro et al. Learning Non-Discriminatory Predictors, 2017, COLT.

[27] Cary Coglianese et al. Regulating by Robot: Administrative Decision Making in the Machine-Learning Era, 2017.

[28] Sam Corbett-Davies et al. Algorithmic decision making and the cost of fairness, 2017, arXiv:1701.08230.

[29] Jon M. Kleinberg et al. Inherent Trade-Offs in the Fair Determination of Risk Scores, 2016, ITCS.

[30] M. Kearns et al. Fairness in Criminal Justice Risk Assessments: The State of the Art, 2017, Sociological Methods & Research.