Accuracy of Malaysia Public Response to Economic Factors During the Covid-19 Pandemic Using Vader and Random Forest

This study analyzes public sentiment on Twitter about the economic impact of the Covid-19 pandemic on people's lives. The analysis covers 23,777 tweets collected from 13 states in Malaysia between 1 December 2019 and 17 June 2020. The research proceeds in three stages: pre-processing, labeling, and modeling. In the pre-processing stage, the data are collected and cleaned. In the labeling stage, VADER sentiment polarity detection assigns a sentiment label to each tweet, and the labeled tweets serve as training data. In the modeling stage, the labeled data are classified with the random forest algorithm, using count vectorizer and TF-IDF feature extraction together with N-gram feature selection. Based on the VADER labeling, public sentiment in Malaysia is predominantly positive: 11,323 tweets are positive, 4,105 neutral, and 8,349 negative. The random forest model achieves an accuracy of 93.5 percent with TF-IDF features and 1-grams.
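The labeling rule and the reported class distribution can be illustrated with a minimal sketch. It assumes the tweets have already been scored by VADER and that the compound score is cut at the commonly recommended thresholds of +0.05 and -0.05; the abstract does not state which thresholds were used. The function names and the example scores are hypothetical, while the three class counts are the ones reported above.

```python
from collections import Counter

# Assumed cut-offs: the +/-0.05 compound-score thresholds commonly
# recommended for VADER. The abstract does not state the values used.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label_from_compound(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to one of three classes."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def class_shares(counts: dict) -> dict:
    """Return each class's share of the total as a percentage (1 decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical compound scores, for illustration only.
example_scores = [0.62, -0.41, 0.0, 0.05, -0.05, 0.01]
example_labels = Counter(label_from_compound(s) for s in example_scores)

# Label counts reported in the abstract.
reported = {"positive": 11323, "neutral": 4105, "negative": 8349}
shares = class_shares(reported)

print(example_labels)
print(sum(reported.values()), shares)
```

Under these assumptions, the reported counts sum to the 23,777 tweets in the data set, and the positive class is the largest one at roughly 47.6 percent, followed by negative at about 35.1 percent and neutral at about 17.3 percent.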