UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?

In this paper, we present our system for assigning an emoji to a tweet based on its text. Each tweet was originally posted with an emoji, which the task providers removed. Our task was to decide which of 20 emojis originally accompanied the tweet. Two datasets were provided, one in English and the other in Spanish. We treated the task as a standard classification task, with the emojis as our classes and the tweets as our documents. Our best performing system used a Bag of Words model with a Linear Support Vector Machine as its classifier. We achieved a macro F1 score of 32.73% for the English data and 17.98% for the Spanish data.
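Because the reported results are macro F1 scores, a minimal sketch of how that metric can be computed may help in reading them. It is an illustration only, not the task's official scorer: the function name `macro_f1`, the toy emoji labels, and the choice to average over every label seen in either the gold or the predicted list are assumptions made for this example.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 (hypothetical helper, not the official scorer)."""
    # Assumption: the label set is every class seen in gold or predicted.
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label but was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall for this class.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    # Every class counts equally, however rare it is.
    return sum(scores) / len(scores)


# Toy data (invented for illustration): five tweets, three emoji classes.
gold = ["heart", "heart", "heart", "joy", "fire"]
predicted = ["heart", "heart", "joy", "joy", "heart"]

print(round(macro_f1(gold, predicted), 3))  # prints 0.444
```

In this toy example three of five predictions are correct (60% accuracy), yet the macro F1 is about 44% because the "fire" class is never predicted and contributes an F1 of zero. With 20 emoji classes of unequal frequency, the same effect means that a classifier favoring frequent emojis can score noticeably lower on macro F1 than on accuracy.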