A Novel Context-Aware Multimodal Framework for Persian Sentiment Analysis

Most recent work on sentiment analysis has exploited the text modality. However, the millions of hours of video recordings posted on social media platforms every day hold vital unstructured information that can be exploited to gauge public perception more effectively. Multimodal sentiment analysis offers an innovative way to computationally understand and harvest sentiments from videos by contextually exploiting audio, visual and textual cues. In this paper, we first present a first-of-its-kind Persian multimodal dataset comprising more than 800 utterances, as a benchmark resource for researchers to evaluate multimodal sentiment analysis approaches in the Persian language. Second, we present a novel context-aware multimodal sentiment analysis framework that simultaneously exploits acoustic, visual and textual cues to determine the expressed sentiment more accurately. We employ both decision-level (late) and feature-level (early) fusion methods to integrate affective cross-modal information. Experimental results demonstrate that the contextual integration of textual, acoustic and visual features delivers better performance (91.39%) than unimodal features (89.24%).
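
The two fusion strategies named above can be illustrated with a minimal sketch. It is not the framework's implementation: the abstract does not specify the feature extractors, classifiers or fusion weights, so the function names (`early_fusion`, `late_fusion`), the toy feature sizes and the weighted-average rule for late fusion are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[float]]  # one row per utterance


def early_fusion(*modalities: Matrix) -> Matrix:
    """Feature-level (early) fusion: concatenate each utterance's
    per-modality feature vectors into a single joint vector."""
    n = len(modalities[0])
    if any(len(m) != n for m in modalities):
        raise ValueError("all modalities must cover the same utterances")
    return [[x for m in modalities for x in m[i]] for i in range(n)]


def late_fusion(
    modality_probs: Sequence[Matrix],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[int], Matrix]:
    """Decision-level (late) fusion: combine the class probabilities
    produced by separate per-modality classifiers.

    A weighted average is assumed here; other rules (majority voting,
    a trained meta-classifier) are equally valid choices.
    """
    if weights is None:
        weights = [1.0] * len(modality_probs)
    total = sum(weights)
    norm = [w / total for w in weights]  # weights sum to 1

    n_utts = len(modality_probs[0])
    n_classes = len(modality_probs[0][0])
    fused = [
        [
            sum(w * probs[i][c] for w, probs in zip(norm, modality_probs))
            for c in range(n_classes)
        ]
        for i in range(n_utts)
    ]
    labels = [max(range(n_classes), key=row.__getitem__) for row in fused]
    return labels, fused


if __name__ == "__main__":
    # Hypothetical features for two utterances (sizes are arbitrary).
    text = [[0.1, 0.2], [0.3, 0.4]]
    audio = [[0.5], [0.6]]
    visual = [[0.7, 0.8], [0.9, 1.0]]
    joint = early_fusion(text, audio, visual)
    print(len(joint), len(joint[0]))  # utterances, joint feature size

    # Hypothetical per-modality outputs: P(negative), P(positive).
    p_text = [[0.9, 0.1], [0.4, 0.6]]
    p_audio = [[0.6, 0.4], [0.3, 0.7]]
    p_visual = [[0.2, 0.8], [0.45, 0.55]]
    labels, fused = late_fusion([p_text, p_audio, p_visual])
    print(labels)
```

In early fusion a single classifier is trained on the joint vector, so it can learn cross-modal interactions directly; in late fusion each modality keeps its own classifier and only the decisions are merged, which tolerates a missing or noisy modality more easily.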
