Tracing Unemployment Rate of South Africa during the COVID-19 Pandemic Using Twitter Data (Preprint)

BACKGROUND The global economy has been hit hard by the COVID-19 pandemic, and many countries are experiencing a severe and destructive recession. The unemployment rate is important to policymakers because it provides a key indicator of overall labour market and wider economic conditions. Despite its relevance, the indicator usually becomes available only after a delay, as it is traditionally based on household surveys conducted over several months. The speed at which the economies of most countries declined at the onset of COVID-19 highlights the importance of timely labour market information at the start of a recession. In the coming year, uncertainty about the timing and extent of any improvement in labour market outcomes will further highlight the value of timely information. OBJECTIVE The main goal of this study is to provide policymakers and decision-makers with additional, real-time information about labour market flows during a prolonged pandemic. The first objective is to estimate missing unemployment rates in cases where census measurements are incomplete. The second objective is to estimate the unemployment rate in real time, since formal unemployment data usually take months to be published. In this paper, we use social media data, particularly Twitter, to trace and nowcast the unemployment rate of South Africa during the COVID-19 pandemic. METHODS The unemployment rate in South Africa is estimated quarterly. We first used the Google mobility index to interpolate the quarterly series and obtain monthly values. Next, we created a dataset of unemployment-related tweets in South Africa using keywords such as "employed", "unemployed", and "retrench". Principal Component Regression (PCR) was applied to estimate the unemployment rate from the tweets and their sentiment scores. RESULTS Numerical results indicate that the number of tweets is highly correlated with the unemployment rate both before and during the COVID-19 pandemic. In addition, the trend of the normalized sum of the tweets' sentiment scores is negatively correlated with the unemployment rate of South Africa. Moreover, the unemployment rate estimated using PCR is highly correlated with the actual unemployment rate of South Africa and has a low Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). CONCLUSIONS The results of this study show that social media information can be used to reasonably estimate one of the key labour market indicators, especially during disaster events such as a prolonged pandemic. This information can be used to rapidly understand and manage the impacts of the pandemic on the economy and on people's lives.
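To make the estimation and evaluation steps concrete, the following is a minimal sketch of principal component regression with RMSE and MAE, written in Python with NumPy. It is not the study's implementation: the feature names (`tweet_count`, `sentiment_sum`, `retweets`), the number of retained components, and all of the data are assumptions, and the series are synthetic values generated only so that the example runs.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on standardized X, then OLS on the scores."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant feature columns
    Z = (X - mean) / std
    # Rows of Vt are the principal directions of the standardized features.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T
    # Ordinary least squares of y on an intercept plus the retained scores.
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"mean": mean, "std": std, "components": components, "coef": coef}


def predict_pcr(model, X):
    """Project new observations onto the retained components and apply the OLS fit."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["components"].T
    return model["coef"][0] + scores @ model["coef"][1:]


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic monthly data (NOT from the study): three hypothetical tweet-derived
# features driven by one latent labour-market signal, plus noise.
rng = np.random.default_rng(0)
n_months = 24
latent = np.linspace(0.0, 1.0, n_months)
tweet_count = 1000 + 800 * latent + rng.normal(0, 30, n_months)
sentiment_sum = -0.5 * latent + rng.normal(0, 0.03, n_months)
retweets = 300 + 250 * latent + rng.normal(0, 20, n_months)
X = np.column_stack([tweet_count, sentiment_sum, retweets])
y = 28.0 + 5.0 * latent + rng.normal(0, 0.2, n_months)  # synthetic rate, in percent

model = fit_pcr(X, y, n_components=1)
y_hat = predict_pcr(model, X)
corr = float(np.corrcoef(y, y_hat)[0, 1])
print(f"RMSE={rmse(y, y_hat):.3f}  MAE={mae(y, y_hat):.3f}  r={corr:.3f}")
```

Retaining fewer components than features is what distinguishes PCR from ordinary least squares: when the tweet-derived features are strongly collinear, as volume and sentiment measures often are, regressing on the leading components tends to give a more stable fit. The errors printed here are in-sample; a nowcasting setting would normally evaluate on months held out from the fit.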
