Exploring Speech Cues in Web-mined COVID-19 Conversational Vlogs

The COVID-19 pandemic caused by the novel coronavirus SARS-CoV-2 has impacted people's lives in unprecedented ways. During the pandemic, vloggers have used social media to actively share their opinions and experiences of quarantine. In this paper, we collected videos from YouTube to track emotional responses in conversational vlogs and their potential associations with events related to the pandemic. In particular, we analyzed vlogs uploaded from locations in New York City, one of the first epicenters of the pandemic in the United States. We observed common patterns in vloggers' acoustic and linguistic features across the span of the quarantine, indicative of changes in emotional reactivity. Additionally, we investigated fluctuations in acoustic and linguistic patterns in relation to COVID-19 events in the New York area (e.g., the number of daily new cases, the number of deaths, and the extension of the stay-at-home order and state of emergency). Our results indicate that acoustic features, such as zero-crossing rate, jitter, and shimmer, can be valuable for analyzing emotional reactivity in social media videos. Our findings further indicate that some peaks in the acoustic and linguistic indices align with COVID-19 events, such as the peak in the number of deaths and the emergency declaration.
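As an illustration of the acoustic cues named above, the Python sketch below extracts zero-crossing rate, jitter, and shimmer from a single audio clip. The abstract does not specify the extraction toolkit, so the choice of librosa and praat-parselmouth, the 16 kHz sample rate, and the Praat threshold values are all illustrative assumptions, not the authors' pipeline.

```python
# Minimal per-clip feature extraction sketch, assuming audio has already been
# ripped from each vlog to WAV. Library choices and parameters are illustrative.
import librosa
import parselmouth
from parselmouth.praat import call

def extract_speech_cues(wav_path: str) -> dict:
    # Zero-crossing rate: mean fraction of sign changes per frame,
    # a coarse proxy for noisiness/spectral content of the speech.
    y, sr = librosa.load(wav_path, sr=16000)
    zcr = float(librosa.feature.zero_crossing_rate(y).mean())

    # Jitter and shimmer via Praat: cycle-to-cycle perturbations of pitch
    # period and amplitude, often associated with vocal arousal and strain.
    snd = parselmouth.Sound(wav_path)
    point_process = call(snd, "To PointProcess (periodic, cc)", 75, 500)
    jitter = call(point_process, "Get jitter (local)",
                  0, 0, 0.0001, 0.02, 1.3)
    shimmer = call([snd, point_process], "Get shimmer (local)",
                   0, 0, 0.0001, 0.02, 1.3, 1.6)
    return {"zcr": zcr, "jitter": jitter, "shimmer": shimmer}
```

Aggregating these per-clip values by upload date would yield the kind of time series whose peaks can then be compared against the dates of local COVID-19 events, as the abstract describes.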
