Real-Time High-Resolution Background Matting

We introduce a real-time, high-resolution background replacement technique that runs at 30fps in 4K resolution and 60fps in HD on a modern GPU. Our technique is based on background matting, in which an additional frame of the background is captured and used to recover the alpha matte and the foreground layer. The main challenge is to compute a high-quality alpha matte that preserves strand-level hair details while processing high-resolution images in real time. To achieve this, we employ two neural networks: a base network computes a low-resolution result, which is then refined by a second network operating at high resolution on selective patches. We introduce two large-scale video and image matting datasets: VideoMatte240K and PhotoMatte13K/85. Our approach yields higher-quality results than the previous state of the art in background matting, while also delivering a dramatic boost in both speed and resolution.
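
As a rough illustration of the two ideas above, the following Python sketch shows (a) how a recovered alpha matte and foreground layer can be composited onto a new background using the standard matting equation I = alpha * F + (1 - alpha) * B, and (b) one possible way to pick "selective patches" for high-resolution refinement. The function names, the coarse per-cell error map, and the top-k selection rule are assumptions made for illustration; the abstract does not specify how patches are chosen.

```python
import numpy as np

def composite(alpha, fgr, new_bgr):
    """Standard matting composite: I = alpha * F + (1 - alpha) * B.

    alpha has shape (H, W, 1) with values in [0, 1];
    fgr and new_bgr have shape (H, W, 3).
    """
    return alpha * fgr + (1.0 - alpha) * new_bgr

def select_patches(error_map, k):
    """Hypothetical patch selection (an assumption, not the paper's stated rule).

    error_map is a coarse 2D grid in which each cell scores how unreliable
    the low-resolution result is for the corresponding image patch.
    Returns the (row, col) indices of the k highest-scoring cells.
    """
    order = np.argsort(error_map.ravel())[::-1][:k]
    rows, cols = np.unravel_index(order, error_map.shape)
    return list(zip(rows.tolist(), cols.tolist()))

# Usage: fully opaque alpha keeps the foreground unchanged.
alpha = np.ones((2, 2, 1))
fgr = np.full((2, 2, 3), 0.25)
new_bgr = np.zeros((2, 2, 3))
out = composite(alpha, fgr, new_bgr)

# Usage: refine only the two least reliable cells of a 2x2 grid.
coarse_error = np.array([[0.1, 0.9],
                         [0.5, 0.0]])
patches = select_patches(coarse_error, k=2)
```

Restricting the second network to a small number of patches is what would keep the high-resolution pass cheap: most pixels of a typical frame are clearly foreground or clearly background, so only uncertain regions such as hair boundaries need refinement.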
