Paper Information - Fusion of Multichannel Local and Global Structural Cues for Photo Aesthetics Evaluation

Photo aesthetic quality evaluation is a fundamental yet under-addressed task in computer vision and image processing. Conventional approaches suffer from two drawbacks. First, both the local and global spatial arrangements of image regions play an important role in photo aesthetics, yet existing rules, e.g., visual balance, only heuristically define which spatial distribution among the salient regions of a photo is aesthetically pleasing. Second, it is difficult to automatically adjust the visual cues from multiple channels in photo aesthetics assessment. To solve these problems, we propose a new photo aesthetics evaluation framework that focuses on learning image descriptors that characterize local and global structural aesthetics from multiple visual channels. In particular, to describe the spatial structure of local image regions, we construct graphlets (small-sized connected graphs) by connecting spatially adjacent atomic regions. Since spatially adjacent graphlets lie close together in their feature space, we project them onto a manifold and subsequently propose an embedding algorithm. The embedding algorithm encodes the photo's global spatial layout into the graphlets. Simultaneously, the importance of graphlets from multiple visual channels is dynamically adjusted. Finally, these post-embedding graphlets are integrated for photo aesthetics evaluation using a probabilistic model. Experimental results show that: 1) the visualized graphlets explicitly capture the aesthetically arranged atomic regions; 2) the proposed approach generalizes and improves four prominent aesthetic rules; and 3) our approach significantly outperforms state-of-the-art algorithms in photo aesthetics prediction.
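
To make the graphlet construction concrete, the following is a minimal sketch, not the authors' implementation. It assumes the photo has already been segmented into atomic regions and that their spatial adjacency is given as a dictionary; the names `is_connected`, `extract_graphlets`, `adjacency`, and `max_size` are hypothetical. It enumerates every connected set of up to `max_size` adjacent regions by brute force, which is only practical for small region graphs.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph of `adjacency`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency.get(current, ()):
            # Only walk along edges that stay inside the candidate node set.
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


def extract_graphlets(adjacency, max_size):
    """Enumerate connected region subsets with 1..max_size regions.

    Brute-force enumeration (an assumption made for clarity): every
    combination of regions is tested for connectivity.
    """
    regions = sorted(adjacency)
    graphlets = []
    for size in range(1, max_size + 1):
        for candidate in combinations(regions, size):
            if is_connected(candidate, adjacency):
                graphlets.append(candidate)
    return graphlets


# Hypothetical region adjacency graph: four atomic regions in a chain 0-1-2-3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
graphlets = extract_graphlets(adjacency, max_size=3)
print(graphlets)
```

In this toy chain, regions 0 and 2 are not adjacent, so the pair `(0, 2)` is not a graphlet, while `(0, 1, 2)` is. In a full pipeline each graphlet would then be described by per-channel features of its regions before the embedding step.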
