Enabling 5G: sentimental image dominant graph topic model for cross-modality topic detection

Fifth-generation mobile networks (5G) are entering daily life. They will not only support a 1000-fold increase in Internet traffic over the next decade but will also offer a "smarter" user experience. With the commercial deployment of 5G, online social networks and smartphone use will surge again, and cross-modality data will play a more important role in everyday information dissemination. As an effective means of content analysis, topic detection has attracted much research interest, but conventional topic analysis is limited by heterogeneous cross-modality data. This paper proposes a sentimental image dominant graph topic model that can detect topics in heterogeneous data and mine the sentiment of each topic. In detail, we design a topic model that maps both the low-level visual modality and the high-level text modality into a semantic manifold, and we improve the discriminative power of the CNN features by jointly optimizing the outputs of the convolutional and fully-connected layers. Furthermore, since sentiment is highly significant for understanding the intrinsic meaning of topics, we introduce a semantic score for subjective sentences that calculates sentiment on the basis of the contextual sentence structure. Comparison experiments on a public cross-modality benchmark show the promising performance of our model. Our AI-based method can thus help make 5G more intelligent.
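To make the idea of scoring subjective sentences from their contextual structure more concrete, the following is a minimal sketch and not the paper's actual scoring function. It assumes a toy polarity lexicon and simple rules in which a preceding negator flips the sign of a sentiment word and a preceding intensifier scales it. All names (`POLARITY`, `NEGATORS`, `INTENSIFIERS`, `sentence_sentiment`, `topic_sentiment`) and all weights are hypothetical and chosen only for illustration.

```python
# Toy lexicon: words and weights are illustrative assumptions, not from the paper.
POLARITY = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}


def sentence_sentiment(sentence: str) -> float:
    """Sum word polarities; a preceding negator flips the sign of the next
    sentiment word and a preceding intensifier scales it."""
    tokens = sentence.lower().replace(".", " ").replace(",", " ").split()
    score = 0.0
    negate = False
    scale = 1.0
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
        elif tok in INTENSIFIERS:
            scale *= INTENSIFIERS[tok]
        elif tok in POLARITY:
            value = POLARITY[tok] * scale
            score += -value if negate else value
            # Context modifiers apply only to the next sentiment word.
            negate, scale = False, 1.0
    return score


def topic_sentiment(sentences) -> float:
    """Average the sentence scores of all sentences assigned to one topic."""
    scores = [sentence_sentiment(s) for s in sentences]
    return sum(scores) / len(scores) if scores else 0.0
```

In this sketch, the sentiment of a topic is simply the mean of its sentence scores. A real system would presumably draw polarities from a lexical resource and use richer sentence structure than adjacent modifiers.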
