Reflectance and texture encoding for material recognition and synthesis

Abstract of the dissertation "Reflectance and Texture Encoding for Material Recognition and Synthesis" by Hang Zhang. Dissertation Director: Dr. Kristin Dana.

Material recognition plays an important role in enabling a machine to understand and interact with the world. For example, an autonomous vehicle can use material recognition to determine whether the terrain is asphalt, grass, gravel, ice or snow in order to optimize its mechanical control, and a robot can grasp an object with appropriate pressure if the object’s composition is known. This thesis is dedicated to developing compact and robust material and texture representations for fast visual recognition and synthesis.

Color and geometry are not a full measure of the richness of visual appearance. Reflectance describes how light interacts with a surface, which depends on the surface’s microscopic and mesoscopic composition. This thesis explores how reflectance can reveal material categories and physical properties. We build representations that capture the spatial and angular variation of reflectance. These multi-layer deep learning representations provide invariance to intra-class variations for recognition. We also develop representations that capture sufficient detail for synthesis. In particular, this thesis develops the following methods:

1. Reflectance Hashing: Reflectance is challenging to measure and use for recognizing materials due to its high dimensionality. In this work, we bypass the use of a gonioreflectometer by using a novel one-shot reflectance camera based on a parabolic
