Investigating Sports Commentator Bias within a Large Corpus of American Football Broadcasts

Sports broadcasters inject drama into play-by-play commentary by building team and player narratives through subjective analyses and anecdotes. Prior studies based on small datasets and manual coding show that such theatrics evince commentator bias in sports broadcasts. To examine this phenomenon at scale, we assemble FOOTBALL, a corpus of 1,455 broadcast transcripts of American football games spanning six decades, automatically annotated with 250K player mentions and linked with racial metadata. We identify major confounding factors for researchers examining racial bias in FOOTBALL, and perform a computational analysis that supports conclusions from prior social science studies.
