Transportation sentiment analysis using word embedding and ontology-based topic modeling

Abstract Social networks offer a new way to collect information about mobility and transportation services. Sentiment analysis of this information can support intelligent transportation systems (ITSs) in examining traffic control and management systems. However, sentiment analysis faces two technical challenges: extracting meaningful information from social network platforms, and transforming the extracted data into valuable information. Accurate topic modeling and document representation are further challenges. We propose an approach to sentiment classification that combines ontology and latent Dirichlet allocation (OLDA)-based topic modeling with word embedding. The proposed system retrieves transportation content from social networks, removes irrelevant content to extract meaningful information, and generates topics and features from the extracted data using OLDA. It also represents documents using word embedding techniques, and then employs lexicon-based approaches to enhance the accuracy of the word embedding model. The proposed ontology and the intelligent model are developed using Web Ontology Language and Java, respectively. Machine learning classifiers are used to evaluate the proposed word embedding system. The method achieves an accuracy of 93%, which shows that the proposed approach is effective for sentiment classification.
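To make the document-representation step concrete, the following is a minimal sketch of one common way to combine word embeddings with a sentiment lexicon: average the word vectors of a document and append lexicon-derived polarity totals as extra features. It is not the authors' implementation (the abstract states that their model is written in Java and does not give this level of detail). The toy vectors, the lexicon entries, and the names `EMBEDDINGS`, `LEXICON`, and `document_vector` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

DIM = 3  # dimensionality of the toy vectors below (assumption for illustration)

# Toy word vectors; a real system would load pretrained embeddings instead.
EMBEDDINGS: Dict[str, List[float]] = {
    "traffic": [0.2, 0.1, 0.7],
    "jam": [0.1, 0.0, 0.9],
    "terrible": [0.9, 0.1, 0.1],
    "smooth": [0.0, 0.9, 0.2],
}

# Hypothetical polarity lexicon: word -> score in [-1, 1].
LEXICON: Dict[str, float] = {"terrible": -1.0, "smooth": 1.0}


def document_vector(tokens: List[str]) -> List[float]:
    """Return the mean word vector followed by [positive_total, negative_total]."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if known:
        mean = [sum(v[i] for v in known) / len(known) for i in range(DIM)]
    else:
        # No token has an embedding: fall back to a zero vector.
        mean = [0.0] * DIM
    # Lexicon features: summed positive scores and summed magnitudes of negative scores.
    pos = sum((LEXICON[t] for t in tokens if LEXICON.get(t, 0.0) > 0), 0.0)
    neg = sum((-LEXICON[t] for t in tokens if LEXICON.get(t, 0.0) < 0), 0.0)
    return mean + [pos, neg]


if __name__ == "__main__":
    print(document_vector(["traffic", "jam", "terrible"]))
```

The resulting fixed-length vector could then be passed to any standard classifier. Appending the lexicon totals gives the classifier an explicit polarity signal that a plain average of word vectors may dilute.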
