Scoring Photographic Rule of Thirds in a Large MIRFLICKR Dataset: A Showdown Between Machine Perception and Human Perception of Image Aesthetics

In this research, we develop and evaluate a system that grades the visual aesthetics of an image using the ‘Rule of Thirds’, a compositional metric used by photographers. The novel aspect of the work is that it combines quantitative and qualitative research by taking human psychology into account. The core idea is to measure how similar the perception of a ‘good image’ and a ‘bad image’ is between machines and humans, through a user study with 255 participants on 5000 images from the standard MIRFLICKR database [9]. The rule of thirds is a compositional norm used by photographers and inspired by the golden ratio. It states that if an image is segmented on a 3 × 3 grid, the image is appealing to the eye when its most salient object(s), or ‘subject(s)’, are located precisely on, or aligned with, the middle grid lines [11]. First, we preprocess the input image by labeling the regions that attract the human eye using two saliency algorithms, Graph-Based Visual Saliency (GBVS) [3] and Itti-Koch [4]. Next, we quantify the rule of thirds property by mathematically measuring how closely the location of the salient region(s) adheres to the rule of thirds. This measure is then used to rank or score an input image. To validate the approach, we conducted a user study in which 255 human subjects ranked the images, and we compared their rankings with our algorithmic results, making this both a quantitative and a qualitative study. We also analyze the performance differences between the two saliency algorithms and present ROC plots along with a quantification of the similarity between the algorithms and the human subjects. Our large user study and experimental results provide evidence of modern machines’ ability to mimic human-like behavior. In addition, the results computationally demonstrate the significance of the rule of thirds.
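The abstract does not give the scoring formula, so the following is only a minimal sketch of how a rule-of-thirds score could be derived from a saliency map. It assumes the saliency map has already been computed elsewhere (for example by GBVS or Itti-Koch) and is passed in as a 2-D array. The function name `rule_of_thirds_score`, the use of the saliency-weighted centroid, and the linear distance falloff are illustrative assumptions, not the authors' method.

```python
import numpy as np


def rule_of_thirds_score(saliency):
    """Hypothetical score in [0, 1] from a 2-D saliency map.

    1.0 means the saliency-weighted centroid lies on one of the four
    lines of the 3 x 3 grid; lower values mean it is further from the
    nearest line. The centroid and the linear falloff are assumptions.
    """
    sal = np.asarray(saliency, dtype=float)
    if sal.ndim != 2 or sal.sum() <= 0:
        raise ValueError("expected a 2-D saliency map with positive mass")

    h, w = sal.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = sal.sum()

    # Saliency-weighted centroid, using pixel centres, normalised to [0, 1].
    cx = ((xs + 0.5) * sal).sum() / total / w
    cy = ((ys + 0.5) * sal).sum() / total / h

    # The grid lines sit at one third and two thirds of each axis.
    thirds = np.array([1.0 / 3.0, 2.0 / 3.0])
    dx = np.abs(thirds - cx).min()  # distance to nearest vertical line
    dy = np.abs(thirds - cy).min()  # distance to nearest horizontal line

    # The nearest line is never more than 1/3 away (reached at the border),
    # so dividing by 1/3 maps the distance into [0, 1].
    d = min(dx, dy)
    return float(1.0 - d / (1.0 / 3.0))


if __name__ == "__main__":
    demo = np.zeros((6, 6))
    demo[0, 1:3] = 1.0  # salient blob centred on the left vertical line
    print(rule_of_thirds_score(demo))
```

A single centroid is a coarse summary: an image with several salient regions would need a per-region treatment, which this sketch deliberately leaves out.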

[1] Yuzhen Niu, et al. Rule of Thirds Detection from Photograph, 2011, 2011 IEEE International Symposium on Multimedia.

[2] C. Koch, et al. A saliency-based search mechanism for overt and covert shifts of visual attention, 2000, Vision Research.

[3] Jana Kosecka, et al. Visual door detection integrating appearance and shape cues, 2008, Robotics Auton. Syst.

[4] Mark J. Huiskes, et al. The MIR flickr retrieval evaluation, 2008, MIR '08.

[5] Bryan Peterson, et al. Learning to See Creatively, 1988.

[6] C. Redies, et al. Evaluating the Rule of Thirds in Photographs and Paintings, 2014.

[7] Pietro Perona, et al. Graph-Based Visual Saliency, 2006, NIPS.

[8] M. Grgic, et al. Compositional rule of thirds detection, 2012, Proceedings ELMAR-2012.