Paper information - Manchester Metropolitan at SemEval-2018 Task 2: Random Forest with an Ensemble of Features for Predicting Emoji in Tweets

We present our submission to the SemEval 2018 task on emoji prediction. We used a random forest with an ensemble of bag-of-words, sentiment and psycholinguistic features. Although we performed well on the trial dataset (attaining a macro F-score of 63.185 for English and 81.381 for Spanish), our approach did not perform as well on the test data. We describe our features and classification protocol, as well as initial experiments, and conclude with a discussion of the discrepancy between our trial and test results.
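The abstract reports results as a macro F-score. As a rough illustration of that metric only, the sketch below computes the unweighted mean of per-label F1 scores in plain Python. The function name `macro_f1`, the toy labels, and the choice to average over the labels that occur in the gold data are assumptions made for this example; the official task scorer may differ in such details, and the figures quoted above appear to be on a 0 to 100 scale rather than 0 to 1.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 scores, in the range 0..1."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for label g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the right answer but was missed

    # Assumption: average over the labels that occur in the gold data.
    labels = set(gold)
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration (not from the task datasets).
gold = ["heart", "heart", "joy", "fire"]
predicted = ["heart", "joy", "joy", "heart"]
score = macro_f1(gold, predicted)
print(f"macro F1 = {score:.3f}")
```

Because every label contributes equally to the average, a rare emoji that is never predicted correctly lowers the score as much as a frequent one would.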

Matthew Shardlow | Luciano Gerber
