Polarity Classification of Twitter Messages using Audio Processing

Abstract: Polarity classification is one of the most fundamental problems in sentiment analysis. In this paper, we propose a novel method, Sound Cosine Similarity Matching, for polarity classification of Twitter messages. The method incorporates features based on audio data rather than on grammar or other text properties, and thereby eliminates the dependency on external dictionaries. It is especially useful for correctly identifying the misspelled or shortened words that are frequently encountered in text from online social media. The method's performance is evaluated at two levels: (i) the capture rate of misspelled and shortened words, and (ii) the classification performance of the feature set. Our results show that the proposed features improve classification accuracy compared to two other models in the literature.
