Scoot: A Perceptual Metric for Facial Sketches

While it is trivial for humans to quickly assess the perceptual similarity between two images, the underlying mechanisms are thought to be quite complex. Despite this, the most widely adopted perceptual metrics today, such as SSIM and FSIM, are simple, shallow functions that fail to consider many factors of human perception. Recently, the facial modeling community has observed that including both structure and texture significantly benefits face sketch synthesis (FSS). But how perceptual are these so-called “perceptual features”? Which elements are critical for their success? In this paper, we design a perceptual metric, called Structure Co-Occurrence Texture (Scoot), which simultaneously considers block-level spatial structure and co-occurrence texture statistics. To test the quality of metrics, we propose three novel meta-measures based on various reliable properties. Extensive experiments verify that our Scoot metric exceeds the performance of prior work. In addition, we built the first large-scale (152k judgments) human-perception-based sketch database, which can be used to evaluate how consistent a metric is with human perception. Our results suggest that “spatial structure” and “co-occurrence texture” are two generally applicable perceptual features in face sketch synthesis.
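To make the two ingredients named above more concrete, the following is a minimal sketch of how block-level co-occurrence texture statistics could be computed and compared. It is not the Scoot formulation from the paper: the grid size, the number of gray levels, the choice of contrast and energy as statistics, the single horizontal pixel offset, and the distance-to-similarity mapping are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def quantize(img, levels):
    """Map intensities in the 0..255 range to integer gray levels 0..levels-1."""
    img = np.asarray(img, dtype=np.float64)
    q = np.floor(img / 256.0 * levels).astype(np.int64)
    return np.clip(q, 0, levels - 1)

def cooccurrence(block, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    h, w = block.shape
    a = block[0:h - dy, 0:w - dx]   # reference pixels
    b = block[dy:h, dx:w]           # neighbors at the given offset
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1.0)  # count each gray-level pair
    total = glcm.sum()
    return glcm / total if total > 0 else glcm

def block_features(img, grid=4, levels=6):
    """Split the image into grid x grid blocks (the assumed 'spatial structure')
    and collect two co-occurrence statistics per block."""
    q = quantize(img, levels)
    h, w = q.shape
    ys = np.linspace(0, h, grid + 1).astype(int)
    xs = np.linspace(0, w, grid + 1).astype(int)
    i, j = np.indices((levels, levels))
    feats = []
    for r in range(grid):
        for c in range(grid):
            p = cooccurrence(q[ys[r]:ys[r + 1], xs[c]:xs[c + 1]], levels)
            contrast = np.sum(p * (i - j) ** 2)  # large when neighbors differ
            energy = np.sum(p ** 2)              # large when texture is uniform
            feats.extend([contrast, energy])
    return np.array(feats)

def similarity(sketch_a, sketch_b, grid=4, levels=6):
    """Assumed mapping from feature distance to a score in (0, 1]."""
    fa = block_features(sketch_a, grid, levels)
    fb = block_features(sketch_b, grid, levels)
    return 1.0 / (1.0 + np.linalg.norm(fa - fb))
```

Because the features are concatenated block by block, the same texture appearing in a different part of the face yields a different feature vector, which is one simple way to combine texture with spatial layout.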
