Sentiment Analysis of Tweets: Baselines and Neural Network Models

The goal of sentiment analysis is to classify text samples according to their overall positivity or negativity, which we refer to as the polarity of the sample. In this project, we investigate three-class sentiment classification of Twitter data, where the labels are “positive”, “negative”, and “neutral”. We address three questions. First, we examine dataset preprocessing specific to the natural language domain of tweets. Second, we evaluate a number of baseline linear models for sentiment analysis. Finally, we attempt to improve on the performance of these baselines using neural networks initialized with linear model weights. All the algorithms we consider in this project are supervised methods over unigram and bigram features.
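
To make the notion of unigram and bigram features concrete, the following is a minimal sketch of how a tweet might be mapped to n-gram counts. The function name `ngram_features`, the lowercasing step, and the whitespace tokenization are assumptions made for illustration only; they are not the preprocessing pipeline used in this project.

```python
from collections import Counter


def ngram_features(text, n_values=(1, 2)):
    """Count n-gram occurrences (unigrams and bigrams by default).

    Hypothetical helper: tokenization here is plain lowercasing plus
    whitespace splitting, which is an assumption, not the project's method.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        # Slide a window of width n over the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Example: four unigrams and three bigrams from a four-token tweet.
features = ngram_features("I love this phone")
```

Each distinct n-gram becomes one dimension of a sparse feature vector, which a linear classifier can then weight toward one of the three polarity labels.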