TOM-Net: Learning Transparent Object Matting from a Single Image

This paper addresses the problem of transparent object matting. Existing image matting approaches for transparent objects often require tedious capture procedures and long processing times, which limits their practical use. We first formulate transparent object matting as a refractive flow estimation problem. We then propose a deep learning framework, called TOM-Net, for learning the refractive flow. The framework comprises two parts: a multi-scale encoder-decoder network that produces a coarse prediction, and a residual network that refines it. At test time, TOM-Net takes a single image as input and outputs a matte (consisting of an object mask, an attenuation mask, and a refractive flow field) in a fast feed-forward pass. As no off-the-shelf dataset is available for transparent object matting, we create a large-scale synthetic dataset of 178K images of transparent objects rendered in front of images sampled from the Microsoft COCO dataset. We also collect a real dataset of 876 samples using 14 transparent objects and 60 background images. Experiments on both synthetic and real data yield promising results and demonstrate the effectiveness of our approach.
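To make the three matte components concrete, the following is a minimal sketch of how an object mask, an attenuation mask, and a refractive flow field could be used to composite a transparent object onto a new background. The compositing rule used here (background outside the object, attenuated and flow-displaced background inside it) is an assumption for illustration and is not stated in the abstract; the function name `composite_with_matte`, the array layouts, and the nearest-neighbour lookup are likewise hypothetical choices.

```python
import numpy as np


def composite_with_matte(background, mask, attenuation, flow):
    """Composite a transparent object over `background` (illustrative only).

    background:  (H, W, 3) float array
    mask:        (H, W) array, 1 where the object covers the pixel, else 0
    attenuation: (H, W) array in [0, 1], fraction of light passing through
    flow:        (H, W, 2) array of per-pixel (dx, dy) offsets into background
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Follow the refractive flow to the background pixel seen through the
    # object (nearest-neighbour lookup, clamped to the image border).
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    refracted = background[src_y, src_x]

    m = mask[..., None].astype(background.dtype)
    rho = attenuation[..., None]

    # Assumed model: plain background outside the object,
    # attenuated refracted background inside it.
    return (1.0 - m) * background + m * rho * refracted
```

With an all-zero mask the function returns the background unchanged; with a full mask, the output at each pixel is the attenuated background colour at the location the flow points to.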
