A Novel Deep-learning Pipeline for Light Field Image Based Material Recognition

Image-based material recognition rests on the fact that differences in the reflectance of distinct materials lead to differences in their appearance across multiple viewpoints. Light field (LF) cameras can inherently capture multiple sub-aperture images (SAIs) within one exposure, which makes them a suitable multi-view source for material recognition. In this paper, a unified “Factorize-Connect-Merge” (FCM) deep-learning pipeline is proposed for light field image based material recognition. The 4D light-field input is first decomposed into consecutive 3D light-field slices. A shallow CNN is used to extract low-level visual features from each view inside these slices. To establish correspondences between these SAIs, a bidirectional long short-term memory (Bi-LSTM) network is built upon the low-level features to model the imaging differences. After feature selection, which includes concatenation and dimension reduction, effective and robust feature representations for material recognition can be extracted from the 4D light-field data. Experimental results indicate that the proposed pipeline performs well on both single-pixel material classification and full-image material segmentation. In addition, the proposed pipeline may benefit other researchers who take LF images as input and need to extract 4D light-field representations for computer vision tasks such as object classification, semantic segmentation and edge detection.
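
As a rough illustration of the "Factorize" step only, the sketch below splits a 4D light field into 3D slices with NumPy. It is not the authors' implementation: the abstract does not say which angular axis is held fixed, so the function `factorize`, the array layout `(U, V, H, W, C)`, and the 7x7 angular and 32x32 spatial sizes are all assumptions made for the example.

```python
import numpy as np

def factorize(lf):
    """Split a light field of shape (U, V, H, W, C) into 3D slices.

    Assumed layout: U, V are angular indices, H, W are spatial, C is color.
    Returns two lists of view sequences:
      rows: U slices, each (V, H, W, C), fixing u and varying v
      cols: V slices, each (U, H, W, C), fixing v and varying u
    """
    if lf.ndim != 5:
        raise ValueError("expected shape (U, V, H, W, C)")
    num_u, num_v = lf.shape[:2]
    rows = [lf[u] for u in range(num_u)]
    cols = [lf[:, v] for v in range(num_v)]
    return rows, cols

# Hypothetical input: 7x7 sub-aperture images of 32x32 RGB pixels.
lf = np.random.default_rng(0).random((7, 7, 32, 32, 3), dtype=np.float32)
rows, cols = factorize(lf)
print(len(rows), rows[0].shape)
```

Each slice is an ordered sequence of views. That is the form a per-view feature extractor followed by a sequence model such as a Bi-LSTM would take as input in the later "Connect" step.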
