Paper Information - Deep Free-Form Deformation Network for Object-Mask Registration

This paper addresses the problem of object-mask registration, which aligns a shape mask to a target object instance. Prior work typically formulates the problem as an object segmentation task with a mask prior, which is challenging to solve. In this work, we take a transformation-based approach that predicts a 2D non-rigid spatial transform and warps the shape mask onto the target object. In particular, we propose a deep spatial transformer network that learns free-form deformations (FFDs) to non-rigidly warp the shape mask based on a multi-level dual mask feature pooling strategy. The FFD transforms, which are based on B-splines and parameterized by the offsets of predefined control points, are differentiable. Therefore, we are able to train the entire network in an end-to-end manner based on an L2 matching loss. We evaluate our FFD network on a challenging object-mask alignment task, which aims to refine a set of object segment proposals, and our approach achieves state-of-the-art performance on the Cityscapes, PASCAL VOC, and MS COCO datasets.
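To make the B-spline parameterization concrete, the following is a minimal NumPy sketch of a generic cubic B-spline FFD applied to 2D points. It is not the paper's implementation: the function names (`bspline_basis`, `ffd_warp_points`), the control-grid layout (one extra row and column of control points before the domain and two after it), and the coordinate convention are all assumptions chosen for illustration.

```python
import numpy as np

def bspline_basis(u):
    """Return the four uniform cubic B-spline basis values at u in [0, 1]."""
    return np.stack([
        (1 - u) ** 3 / 6,
        (3 * u**3 - 6 * u**2 + 4) / 6,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
        u**3 / 6,
    ], axis=-1)

def ffd_warp_points(points, offsets, width, height):
    """Displace 2D points with a cubic B-spline free-form deformation.

    points:  (N, 2) array of (x, y) in [0, width] x [0, height]
    offsets: (ny + 3, nx + 3, 2) array of control-point offsets (dx, dy),
             where nx and ny are the number of grid cells per axis
             (hypothetical layout, assumed for this sketch)
    """
    ny, nx = offsets.shape[0] - 3, offsets.shape[1] - 3
    # Position of each point measured in grid-cell units.
    gx = points[:, 0] / width * nx
    gy = points[:, 1] / height * ny
    # Index of the enclosing cell, clipped so boundary points stay in range.
    ix = np.clip(np.floor(gx).astype(int), 0, nx - 1)
    iy = np.clip(np.floor(gy).astype(int), 0, ny - 1)
    bx = bspline_basis(gx - ix)  # (N, 4) weights along x
    by = bspline_basis(gy - iy)  # (N, 4) weights along y
    # Gather the 4 x 4 neighbourhood of control-point offsets per point.
    rows = iy[:, None, None] + np.arange(4)[None, :, None]
    cols = ix[:, None, None] + np.arange(4)[None, None, :]
    local = offsets[rows, cols]                 # (N, 4, 4, 2)
    weights = by[:, :, None] * bx[:, None, :]   # (N, 4, 4)
    displacement = np.einsum("nij,nijc->nc", weights, local)
    return points + displacement
```

In this sketch the displacement is a weighted sum of the control-point offsets, so it is linear in those offsets. That is the property that lets gradients of a matching loss flow back to whatever predicts the offsets.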

Xuming He | Haoyang Zhang
