Paper Information - Sentiment Classification System of Twitter Data for US Airline Service Analysis

Sentiment Classification System of Twitter Data for US Airline Service Analysis

The airline industry is a highly competitive market that has grown rapidly over the past two decades. Airline companies typically rely on traditional customer feedback forms, which are tedious and time-consuming. Twitter offers a good alternative source of customer feedback on which sentiment analysis can be performed. In this paper, we worked on a dataset comprising tweets about six major US airlines and performed a multi-class sentiment analysis. The approach starts with pre-processing techniques to clean the tweets and then represents the tweets as vectors using a deep learning concept (Doc2vec) to carry out a phrase-level analysis. The analysis was carried out using seven classification strategies: Decision Tree, Random Forest, SVM, K-Nearest Neighbors, Logistic Regression, Gaussian Naïve Bayes, and AdaBoost. The classifiers were trained on 80% of the data and tested on the remaining 20%. The output for each tweet in the test set is its sentiment (positive, negative, or neutral). From the results, accuracies were calculated to compare the classification approaches, and the overall sentiment count across all six airlines was visualized.
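The evaluation protocol described in the abstract (clean the tweets, hold out 20% for testing, report accuracy) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The cleaning rules, the function names, the example tweets, and the majority-class baseline are all assumptions made for the sketch; the abstract does not list the exact pre-processing steps. The Doc2vec vectors and the seven classifiers named above would take the place of the baseline.

```python
import random
import re
from collections import Counter


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, '#' signs and non-letters.

    These rules are assumed for illustration; the paper's own steps may differ.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop '#'
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits and punctuation
    return " ".join(text.split())              # collapse repeated whitespace


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(train_rows):
    """Stand-in classifier: always predicts the most frequent training label."""
    label = Counter(lbl for _, lbl in train_rows).most_common(1)[0][0]
    return lambda text: label


# Made-up example tweets (hypothetical, not taken from the paper's dataset).
tweets = [
    ("@AirlineA thanks for the great flight! http://t.co/abc", "positive"),
    ("@AirlineB lost my bag again #fail", "negative"),
    ("@AirlineC what time does flight 123 board?", "neutral"),
    ("@AirlineA two hour delay and no updates", "negative"),
    ("@AirlineB crew was lovely today", "positive"),
    ("@AirlineC cancelled with no notice", "negative"),
    ("@AirlineA is wifi available on this route?", "neutral"),
    ("@AirlineB rude gate agent", "negative"),
    ("@AirlineC smooth landing, thank you", "positive"),
    ("@AirlineA still waiting on a refund", "negative"),
]

rows = [(clean_tweet(text), label) for text, label in tweets]
train_rows, test_rows = train_test_split(rows, test_fraction=0.2)
predict = majority_baseline(train_rows)
test_accuracy = accuracy(
    [label for _, label in test_rows],
    [predict(text) for text, _ in test_rows],
)
print(f"train={len(train_rows)} test={len(test_rows)} accuracy={test_accuracy:.2f}")
```

Comparing several classifiers then amounts to calling `accuracy` once per model on the same held-out split, so that the reported numbers are directly comparable.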

Anand Kumar | Ankita Rane

[1] Quoc V. Le, et al. Distributed Representations of Sentences and Documents, 2014, ICML.

[2] Lillian Lee, et al. Opinion Mining and Sentiment Analysis, 2008, Found. Trends Inf. Retr.

[3] Aaas News, et al. Book Reviews, 1893, Buffalo Medical and Surgical Journal.

[4] Vaibhavi N Patodkar, et al. Twitter as a Corpus for Sentiment Analysis and Opinion Mining, 2016.

[5] Haixun Wang, et al. Guest Editorial: Big Social Data Analysis, 2014, Knowl. Based Syst.

[6] Prem Melville, et al. Sentiment analysis of blogs by combining lexical knowledge with text classification, 2009, KDD.

[7] Rui Xia, et al. Ensemble of feature sets and classification algorithms for sentiment classification, 2011, Inf. Sci.