Considering user agreement in learning to predict the aesthetic quality

Robustly ranking the aesthetic quality of images is a long-standing, ill-posed problem. The challenge stems mainly from the diverse subjective opinions of different observers about varied types of content. There is growing interest in estimating user agreement by considering the standard deviation (σ) of the scores, instead of predicting only the mean aesthetic opinion score (μ). Nevertheless, when comparing a pair of contents, few studies consider how confident we are about the difference in their aesthetic scores. In this paper, we therefore propose (1) a re-adapted multi-task attention network that predicts both the mean opinion score and the standard deviation in an end-to-end manner; and (2) a novel confidence interval ranking loss that encourages the model to focus on image pairs for which the difference in aesthetic scores is less certain. With this loss, the model is encouraged to learn the uncertainty of the content that is relevant to the diversity of observers' opinions, i.e., user disagreement. Extensive experiments demonstrate that the proposed multi-task aesthetic model achieves state-of-the-art performance on two different types of aesthetic datasets, i.e., AVA and TMGA.
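As a rough illustration of what "confidence about the difference in aesthetic scores" can mean, the sketch below computes a standard normal-approximation confidence interval for the difference between two mean opinion scores and flags a pair as uncertain when that interval contains zero. This is textbook statistics, not the paper's ranking loss, whose exact form is not given here. The function names, the number of raters per image, and the 95% level (z = 1.96) are assumptions made for the example.

```python
import math


def difference_confidence_interval(mu_a, sigma_a, n_a, mu_b, sigma_b, n_b, z=1.96):
    """Confidence interval for mu_a - mu_b under a normal approximation.

    mu_*    : mean opinion score of each image
    sigma_* : standard deviation of the individual scores
    n_*     : number of raters per image (assumed known)
    z       : critical value, 1.96 for an approximate 95% interval (assumption)
    """
    diff = mu_a - mu_b
    # Standard error of the difference between two independent means.
    stderr = math.sqrt(sigma_a ** 2 / n_a + sigma_b ** 2 / n_b)
    margin = z * stderr
    return diff - margin, diff + margin


def is_uncertain_pair(interval):
    """A pair is treated as uncertain when the interval contains zero."""
    low, high = interval
    return low <= 0.0 <= high


# Hypothetical scores: two images with close means and wide rater disagreement.
close_pair = difference_confidence_interval(5.6, 1.4, 200, 5.4, 1.5, 200)
# Hypothetical scores: two images with clearly separated means.
clear_pair = difference_confidence_interval(6.5, 1.4, 200, 4.5, 1.5, 200)

print(is_uncertain_pair(close_pair))  # interval straddles zero
print(is_uncertain_pair(clear_pair))  # interval lies entirely above zero
```

Under these assumptions, pairs whose interval straddles zero are the ones a ranking objective of this kind would presumably emphasize, since the observers' scores alone do not settle which image is preferred.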
