Uncovering Latent Biases in Text: Method and Application to Peer Review

Quantifying systematic disparities in numerical quantities such as employment rates and wages between population subgroups provides compelling evidence for the existence of societal biases. However, biases in the text written for members of different subgroups (such as in recommendation letters for male and non-male candidates), though widely reported anecdotally, remain challenging to quantify. In this work, we introduce a novel framework to quantify bias in text caused by the visibility of subgroup membership indicators. We develop a nonparametric estimation and inference procedure for this bias. We then formalize an identification strategy that causally links the estimated bias to the visibility of subgroup membership indicators, given observations from time periods both before and after an identity-hiding policy change. To evaluate our framework, we identify an application in which "ground truth" bias can be inferred, instead of relying on synthetic or secondary data. Specifically, we apply our framework to quantify biases in the text of peer reviews from a reputed machine learning conference before and after the conference adopted a double-blind reviewing policy. We show evidence of biases in the review ratings, which serves as "ground truth", and show that our proposed framework accurately detects these biases from the review text without access to the review ratings.
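
The before/after comparison described above can be illustrated with a generic difference-in-differences contrast on a scalar, text-derived score. This is only a minimal sketch under stated assumptions: the abstract does not specify the estimator, so the difference-in-differences form, the percentile bootstrap, the group labels `"a"` and `"b"`, and the function names `did_estimate` and `bootstrap_ci` are all hypothetical choices made for illustration, not the authors' procedure.

```python
import random
from statistics import mean


def did_estimate(scores):
    """Difference-in-differences on per-review scores.

    `scores` maps (group, period) -> list of floats, where group is
    "a" or "b" (hypothetical subgroup labels) and period is "before"
    or "after" the identity-hiding policy change.
    """
    gap_before = mean(scores[("a", "before")]) - mean(scores[("b", "before")])
    gap_after = mean(scores[("a", "after")]) - mean(scores[("b", "after")])
    # The part of the between-group gap that disappears once identity is hidden.
    return gap_before - gap_after


def bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling within each (group, period) cell."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        resampled = {
            cell: rng.choices(values, k=len(values))
            for cell, values in scores.items()
        }
        draws.append(did_estimate(resampled))
    draws.sort()
    lower = draws[int((alpha / 2) * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up toy scores, for illustration only.
    toy = {
        ("a", "before"): [0.9, 1.1, 1.0, 1.2],
        ("b", "before"): [0.1, 0.0, 0.2, -0.1],
        ("a", "after"): [0.5, 0.6, 0.4, 0.5],
        ("b", "after"): [0.5, 0.4, 0.6, 0.5],
    }
    print(did_estimate(toy), bootstrap_ci(toy, n_boot=500))
```

In this sketch the score for each review is assumed to be computed from the text alone (for example by some text model), so the contrast never touches the review ratings. Reading the result causally additionally requires the usual assumption that, absent the policy change, the gap between the two groups would have stayed the same across periods.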
