Analyzing Who and What Appears in a Decade of US Cable TV News

Cable TV news reaches millions of U.S. households each day, and decisions about who appears on the news and which stories are discussed can profoundly influence public opinion and discourse. In this paper, we use computational techniques to analyze a data set of nearly 24/7 video, audio, and text captions from three major U.S. cable TV networks (CNN, FOX News, and MSNBC) over the last decade. Using automated machine learning tools, we detect faces in 244,038 hours of video, label their presented gender, identify prominent public figures, and align text captions to audio. We use these labels to perform face screen time and caption word frequency analyses of the contents of cable TV news. For example, we find that the ratio of female-presenting to male-presenting individuals has increased from 0.41 to 0.54 over the decade. Donald Trump and Barack Obama received the most screen time during this period, with Trump receiving twice as much as Obama. Hillary Clinton's face was on screen 11% of the time when "email" was said in 2015 and 2016. In addition to reporting the results of our own analyses, we describe the design of an interactive web-based tool that allows the general public to perform their own screen time analyses on the entire cable TV news data set.
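The two kinds of statistics quoted above (a screen time ratio between two groups, and the share of utterances of a word during which a given face is on screen) can be illustrated with a minimal sketch. It assumes that face labels are available as (start, end) intervals in seconds and that caption alignment yields one timestamp per utterance of a word. The function names and the toy data are hypothetical and are not taken from the paper's pipeline.

```python
from typing import List, Tuple

# A labeled face appearance, as (start_sec, end_sec). Hypothetical format.
Interval = Tuple[float, float]


def screen_time(intervals: List[Interval]) -> float:
    """Total seconds covered by a list of face intervals (summed per face)."""
    return sum(end - start for start, end in intervals)


def screen_time_ratio(group_a: List[Interval], group_b: List[Interval]) -> float:
    """Screen time of group_a divided by screen time of group_b."""
    return screen_time(group_a) / screen_time(group_b)


def fraction_of_mentions_on_screen(
    word_times: List[float], face_intervals: List[Interval]
) -> float:
    """Share of word utterances that fall inside any of the face intervals."""
    if not word_times:
        return 0.0
    hits = sum(
        any(start <= t < end for start, end in face_intervals)
        for t in word_times
    )
    return hits / len(word_times)


# Toy data, invented purely for illustration.
female_presenting = [(0.0, 10.0), (20.0, 30.0)]   # 20 seconds in total
male_presenting = [(0.0, 25.0), (30.0, 45.0)]     # 40 seconds in total
ratio = screen_time_ratio(female_presenting, male_presenting)

person_on_screen = [(0.0, 10.0), (20.0, 30.0)]
word_utterances = [5.0, 12.0, 22.0, 50.0]         # times a word was said
share = fraction_of_mentions_on_screen(word_utterances, person_on_screen)
```

In this sketch each face contributes its own duration, so two faces on screen at once are counted twice. Whether that matches the paper's definition of screen time is an assumption.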
