Low-rank regularized multi-view inverse-covariance estimation for visual sentiment distribution prediction

Abstract: As people increasingly use images to express opinions and share experiences, sentiment analysis of visual content has attracted considerable attention in the past few years. Traditional sentiment analysis methods mainly focus on predicting the single most dominant sentiment category of an image, neglecting the sentiment ambiguity that arises from factors such as environment, subjectivity, and cultural background. To tackle this problem, visual sentiment distribution prediction has been put forward to characterize each image by a distribution over a set of sentiment labels instead of a single distinct label or multiple distinct labels. Nevertheless, existing approaches usually treat feature embedding and distribution prediction as separate steps. In this paper, we propose a novel supervised visual sentiment distribution prediction model, termed low-rank regularized multi-view inverse-covariance estimation, in which feature embedding and distribution prediction are performed jointly. Specifically, the proposed model contains two main components: a multi-view embedding term and an inverse-covariance estimation term. The multi-view embedding term is subject to low-rank constraints so as to seek the lowest-rank representation of the samples. The inverse-covariance estimation term is subject to structured sparsity regularization so as to learn a more reasonable distribution prediction model. We develop an alternating heuristic optimization algorithm to solve the objective function of the proposed model. Experimental results on three publicly available datasets demonstrate the effectiveness of the proposed scheme compared with state-of-the-art algorithms.
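To make the low-rank constraint mentioned above more concrete, the following is a minimal sketch of singular value thresholding, the proximal step commonly used when a rank constraint is relaxed to a nuclear-norm penalty. The abstract does not state which relaxation or solver the paper uses, so the nuclear-norm relaxation, the function name `singular_value_threshold`, the threshold `tau`, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def singular_value_threshold(matrix, tau):
    """Shrink every singular value of `matrix` by `tau`, clipping at zero.

    This is the proximal operator of tau * (nuclear norm). It is shown as an
    assumed stand-in for a low-rank constraint, not as the paper's own update.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of u by the shrunken singular values, then recombine.
    return (u * s_shrunk) @ vt


# Hypothetical data: a rank-2 "embedding" matrix corrupted by small noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = low_rank + 0.05 * rng.standard_normal((20, 15))

denoised = singular_value_threshold(noisy, tau=1.0)

rank_before = np.linalg.matrix_rank(noisy)
rank_after = np.linalg.matrix_rank(denoised)
print("rank before thresholding:", rank_before)
print("rank after thresholding:", rank_after)
```

In an alternating scheme of the kind the abstract describes, a step like this would typically be applied to the embedding variable while the prediction variable is held fixed, and vice versa; how the paper actually splits its updates is not specified here.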
