Semantic Orientation of Crosslingual Sentiments: Employment of Lexicon and Dictionaries

Sentiment Analysis (SA) is a modern discipline at the crossroads of data mining and natural language processing. It is concerned with the computational treatment of public moods shared as text on social networking websites. Social media users express their feelings in conversations through cross-lingual terms, intensifiers, enhancers, reducers, symbols, and Net Lingo. However, generic SA research lacks comprehensive coverage of these phenomena. In particular, existing approaches handle the semantic orientation of cross-lingual code switching, capitalization, and accentuation in opinionated text poorly, owing to the lack of annotated corpora, computational resources, and linguistic processing, and to inefficient machine translation. This study proposes a Heuristic Framework for Crosslingual Sentiment Analysis (HF-CSA) that takes into account NetLingua, code switching, opinion intensifiers, enhancers, and reducers in order to cope with these intrinsic linguistic peculiarities. The performance of the proposed HF-CSA is examined on a Twitter dataset, and the robustness of the system is assessed on SemEval-2020 Task 9. The results show that HF-CSA outperformed existing systems, reaching average accuracies of 71.6% and 76.18% on the Clift and SemEval-2020 datasets, respectively.
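
To make the lexicon-based idea concrete, the following is a minimal sketch of how a polarity lexicon, intensifiers, reducers, and capitalization might be combined into one score for a code-switched sentence. It is not the HF-CSA implementation, which the abstract does not describe in detail. The lexicon entries (including the Roman Urdu/Hindi words), the weights, and the names `POLARITY`, `INTENSIFIERS`, `REDUCERS`, `CAPS_BOOST`, `score`, and `label` are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy resources; the framework's real lexicons are not given in the abstract.
POLARITY = {"good": 1.0, "bad": -1.0, "love": 2.0, "hate": -2.0,
            "acha": 1.0, "bura": -1.0}           # last two: assumed Roman Urdu/Hindi entries
INTENSIFIERS = {"very": 1.5, "bohat": 1.5}       # assumed multipliers that strengthen an opinion word
REDUCERS = {"slightly": 0.5, "thora": 0.5}       # assumed multipliers that weaken an opinion word
CAPS_BOOST = 1.25                                # assumed extra weight for ALL-CAPS opinion words


def score(text):
    """Sum lexicon polarities, scaling each opinion word by the modifiers directly before it."""
    total = 0.0
    weight = 1.0  # running multiplier built from consecutive modifiers
    for token in re.findall(r"[A-Za-z']+", text):
        word = token.lower()
        if word in INTENSIFIERS:
            weight *= INTENSIFIERS[word]
        elif word in REDUCERS:
            weight *= REDUCERS[word]
        elif word in POLARITY:
            value = POLARITY[word] * weight
            if token.isupper() and len(token) > 1:
                value *= CAPS_BOOST  # treat capitalization as emphasis
            total += value
            weight = 1.0
        else:
            weight = 1.0  # a non-opinion word ends the modifier's scope
    return total


def label(text):
    """Map the numeric score to a three-way sentiment label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ["movie bohat acha tha", "slightly bad", "very GOOD"]:
        print(sentence, "->", score(sentence), label(sentence))
```

In this sketch a single lookup table holds opinion words from both languages, so no machine translation step is needed; that is one possible design choice, and a real system would also need negation handling, spelling normalization, and far larger lexicons.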
