Paper Information - Learning a Discriminative Model for the Perception of Realism in Composite Images

What makes an image appear realistic? In this work, we answer this question from a data-driven perspective by learning the perception of visual realism directly from large amounts of data. In particular, we train a Convolutional Neural Network (CNN) model that distinguishes natural photographs from automatically generated composite images. The model learns to predict the visual realism of a scene in terms of color, lighting, and texture compatibility, without any human annotations of realism. Our model outperforms previous work that relies on hand-crafted heuristics on the task of classifying realistic vs. unrealistic photos. Furthermore, we use the learned model to compute the optimal parameters of a compositing method, maximizing the visual realism score predicted by the CNN. We demonstrate its advantage over existing methods via a human perception study.
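The last step of the abstract, choosing compositing parameters that maximize a realism score, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: `toy_realism_score` is a hypothetical stand-in for the trained CNN (it only compares mean colors inside and outside the mask), the per-channel gain and bias are one possible parameterization of a compositing method, and the simple pattern search is not claimed to be the optimizer used in the paper.

```python
# Toy sketch: tune a color adjustment of a pasted foreground so that a
# realism score increases. The scorer is a hypothetical stand-in, NOT a CNN.

def adjust(pixel, gain, bias):
    """Apply a per-channel gain and bias to one RGB pixel, clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, g * c + b)) for c, g, b in zip(pixel, gain, bias))

def composite(fg, bg, mask, gain, bias):
    """Blend the adjusted foreground over the background using the mask."""
    out = []
    for f, b, m in zip(fg, bg, mask):
        fa = adjust(f, gain, bias)
        out.append(tuple(m * x + (1.0 - m) * y for x, y in zip(fa, b)))
    return out

def weighted_mean(image, weights):
    """Mean RGB color of the image, weighted per pixel."""
    total = sum(weights)
    return [sum(w * px[c] for px, w in zip(image, weights)) / total for c in range(3)]

def toy_realism_score(image, mask):
    """Hypothetical scorer: higher (closer to 0) when the mean color inside
    the mask matches the mean color outside it."""
    inside = weighted_mean(image, mask)
    outside = weighted_mean(image, [1.0 - m for m in mask])
    return -sum((a - b) ** 2 for a, b in zip(inside, outside))

def optimize(fg, bg, mask, steps=50):
    """Pattern search over [gain_r, gain_g, gain_b, bias_r, bias_g, bias_b].
    A trial is kept only if it raises the score, so the score never decreases."""
    params = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

    def score(p):
        return toy_realism_score(composite(fg, bg, mask, p[:3], p[3:]), mask)

    best = score(params)
    step = 0.2
    for _ in range(steps):
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                s = score(trial)
                if s > best:
                    params, best, improved = trial, s, True
        if not improved:
            step *= 0.5  # refine the search once no neighbor improves
    return params, best

# Made-up 4-pixel example: a bluish foreground pasted onto a warm background.
fg = [(0.2, 0.3, 0.8), (0.3, 0.4, 0.9), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
bg = [(0.8, 0.6, 0.3), (0.7, 0.5, 0.2), (0.8, 0.6, 0.3), (0.7, 0.5, 0.2)]
mask = [1.0, 1.0, 0.0, 0.0]

initial = toy_realism_score(composite(fg, bg, mask, [1.0] * 3, [0.0] * 3), mask)
params, best = optimize(fg, bg, mask)
```

With a learned model, `toy_realism_score` would be replaced by the classifier's predicted realism of the composite, and a gradient-based optimizer could take the place of the pattern search.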
