Algorithms now permeate many aspects of human life, and multiple recent studies have reported that these algorithms may exhibit biases pertaining to gender, race, and other demographic characteristics. The metrics used to quantify such biases have so far focused on a static notion of algorithms. However, algorithms evolve over time. For instance, Tay (a conversational bot launched by Microsoft) was arguably not biased at launch but quickly became sexist and racist over time. We propose a set of intuitive metrics for studying variation in bias over time and present results from a case study on the genders represented in images returned by Twitter image search for #Nurse and #Doctor over a period of 21 days. The results indicate that bias varies significantly over time and that its direction can differ from day to day. Hence, one-shot measurements may not suffice for understanding algorithmic bias, motivating further work on studying biases in algorithms over time.
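As a minimal illustration of the kind of day-by-day measurement the abstract describes (not the paper's actual metrics), one could assign each returned image a perceived-gender label and compute a signed per-day bias score for a hashtag; the hashtag, labels, and simple proportion-based score below are assumptions for illustration only.

from collections import Counter

def daily_bias(labels):
    # Signed bias score for one day's search results:
    # (fraction labeled male) - (fraction labeled female), in [-1, 1].
    counts = Counter(labels)
    total = counts["male"] + counts["female"]
    if total == 0:
        return 0.0
    return (counts["male"] - counts["female"]) / total

# Hypothetical per-day gender labels for images returned for #Doctor.
doctor_days = [
    ["male", "male", "female", "male"],      # day 1
    ["female", "female", "male"],            # day 2
    ["male", "female", "female", "female"],  # day 3
]

scores = [daily_bias(day) for day in doctor_days]
print("daily bias scores:", scores)
print("spread over the window:", max(scores) - min(scores))

Tracking such a score over the 21-day window would show both how large the bias is on any given day and how much (and in which direction) it drifts over time.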