Text Classification of Flu-Related Tweets Using FastText with Sentiment and Keyword Features

In this study, we present a framework for flu prediction/detection based on data available from Social Networking Sites (SNS). The framework uses FastText, a state-of-the-art text classifier, to classify Twitter posts as flu-related or flu-unrelated. The FastText-based framework is trained and tested on a pre-labeled dataset and uses sentiment analysis and predefined keyword occurrences as features in addition to the textual features. Results show that the framework improves both the accuracy and the efficiency of flu surveillance systems that use unstructured data such as SNS posts.
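
As an illustration of how sentiment and keyword features might be combined with the text of a post, the sketch below appends extra feature tokens to each tweet and emits a line in FastText's supervised input format (a `__label__` prefix followed by the text). This is only one possible encoding and is not taken from the study: the keyword list, the tiny sentiment lexicon, the `kw_`/`sent_` token names, and the helper function names are all hypothetical.

```python
import re

# Hypothetical keyword list and sentiment lexicon, for illustration only;
# the lists actually used in the study are not given in this text.
FLU_KEYWORDS = {"flu", "fever", "cough", "influenza"}
NEGATIVE_WORDS = {"sick", "awful", "terrible", "miserable"}
POSITIVE_WORDS = {"great", "happy", "good", "love"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def sentiment_token(tokens):
    """Map a lexicon-based polarity score to a single feature token."""
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return "sent_pos"
    if score < 0:
        return "sent_neg"
    return "sent_neu"


def keyword_tokens(tokens):
    """Return one feature token per predefined keyword found in the post."""
    return [f"kw_{k}" for k in sorted(FLU_KEYWORDS.intersection(tokens))]


def to_fasttext_line(text, label=None):
    """Build one line of text plus feature tokens, optionally labeled."""
    tokens = tokenize(text)
    body = " ".join(tokens + keyword_tokens(tokens) + [sentiment_token(tokens)])
    return f"__label__{label} {body}" if label is not None else body


if __name__ == "__main__":
    print(to_fasttext_line("I have a fever and feel awful", "flu"))
```

Lines produced this way could be written to a training file and passed to a supervised FastText trainer, for example `fasttext.train_supervised(input=...)` in the Python bindings; an unlabeled line built with the same function could then be given to the trained model for prediction.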