Cascade Image Matting with Deformable Graph Refinement

Image matting is the estimation of the opacity of foreground objects. Accurate matting results require both correct contours and fine details of the foreground. To better accomplish human image matting, we propose the Cascade Image Matting Network with Deformable Graph Refinement (CasDGR), which automatically predicts precise alpha mattes from single human images without any additional inputs. We adopt a cascaded network architecture that performs matting from low to high resolution, which corresponds to coarse-to-fine optimization. We also introduce the Deformable Graph Refinement (DGR) module, based on graph neural networks (GNNs), to overcome the limitations of convolutional neural networks (CNNs). The DGR module effectively captures long-range relations and gathers both global and local information to help produce finer alpha mattes. We also reduce the computational complexity of the DGR module by dynamically predicting the neighbors of each node, and we apply the DGR module to higher-resolution features. Experimental results demonstrate that CasDGR achieves state-of-the-art performance on synthetic datasets and produces good results on real human images.
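The idea of aggregating features over dynamically predicted neighbors can be illustrated with a minimal NumPy sketch. It is not the CasDGR implementation: the function name `deformable_graph_aggregate`, the integer `offsets` input (which the paper's network would presumably predict, but which is simply passed in here), and the plain mean aggregation are all assumptions made for illustration. Each pixel is treated as a graph node that gathers features from K offset locations, so the cost scales with H·W·K rather than with all pairs of nodes.

```python
import numpy as np

def deformable_graph_aggregate(feat, offsets):
    """Average each node's feature with K neighbors chosen by per-node offsets.

    feat:    (H, W, C) feature map; every pixel is one graph node.
    offsets: (H, W, K, 2) integer (dy, dx) offsets per node. Hypothetical
             input: in a learned model these would be predicted, not given.
    Returns an (H, W, C) array of refined features.
    """
    H, W, _ = feat.shape
    K = offsets.shape[2]
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Neighbor coordinates, clamped so they stay inside the feature map.
    ny = np.clip(ys[..., None] + offsets[..., 0], 0, H - 1)
    nx = np.clip(xs[..., None] + offsets[..., 1], 0, W - 1)
    neighbors = feat[ny, nx]  # shape (H, W, K, C)
    # Mean over the node itself and its K sampled neighbors (an assumed
    # aggregation rule, chosen for simplicity).
    return (feat + neighbors.sum(axis=2)) / (K + 1)

# Tiny usage example: a 2x2 single-channel map, one neighbor per node,
# always one pixel to the right (clamped at the border).
feat = np.array([[[0.0], [1.0]], [[2.0], [3.0]]])
offsets = np.zeros((2, 2, 1, 2), dtype=int)
offsets[..., 1] = 1
refined = deformable_graph_aggregate(feat, offsets)
```

Because every node touches only K sampled locations, long-range neighbors can be reached without building a dense affinity matrix, which is one plausible reading of why such a module becomes affordable on higher-resolution features.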
