Paper Information - Deep View Morphing

Deep View Morphing

Recently, convolutional neural networks (CNNs) have been successfully applied to view synthesis problems. However, such CNN-based methods can suffer from a lack of texture detail, shape distortions, or high computational complexity. In this paper, we propose a novel CNN architecture for view synthesis, called Deep View Morphing, that does not suffer from these issues. To synthesize a middle view of two input images, a rectification network first rectifies the two input images. An encoder-decoder network then generates dense correspondences between the rectified images, along with blending masks that predict the visibility of pixels of the rectified images in the middle view. Finally, a view morphing network synthesizes the middle view using the dense correspondences and blending masks. We show experimentally that the proposed method significantly outperforms the state-of-the-art CNN-based view synthesis method.
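
The final morphing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes grayscale rectified images stored as lists of rows, a purely horizontal correspondence field defined on the middle view, and a single blending mask whose complement weights the second image. The names `sample_row` and `morph_middle_view`, and the half-offset sampling convention, are hypothetical choices made for illustration only.

```python
import math


def sample_row(row, x):
    """Linearly interpolate a 1-D row of intensities at fractional position x.

    Positions outside the row are clamped to its ends (an assumption of this sketch).
    """
    x = min(max(x, 0.0), len(row) - 1.0)
    x0 = int(math.floor(x))
    x1 = min(x0 + 1, len(row) - 1)
    t = x - x0
    return (1.0 - t) * row[x0] + t * row[x1]


def morph_middle_view(r1, r2, disparity, mask1):
    """Blend two rectified grayscale images into a middle view.

    Assumed convention: middle-view pixel (y, x) corresponds to
    r1 at x + d/2 and to r2 at x - d/2, where d = disparity[y][x].
    mask1[y][x] in [0, 1] weights r1; r2 is weighted by 1 - mask1[y][x].
    """
    out = []
    for y, (row1, row2) in enumerate(zip(r1, r2)):
        out_row = []
        for x in range(len(row1)):
            d = disparity[y][x]
            m = mask1[y][x]
            p1 = sample_row(row1, x + 0.5 * d)  # sample first rectified image
            p2 = sample_row(row2, x - 0.5 * d)  # sample second rectified image
            out_row.append(m * p1 + (1.0 - m) * p2)
        out.append(out_row)
    return out


# Toy example: a one-row scene shifted by two pixels between the two views.
r1 = [[0.0, 10.0, 20.0, 30.0]]
r2 = [[20.0, 30.0, 40.0, 50.0]]
disparity = [[2.0, 2.0, 2.0, 2.0]]
mask1 = [[0.5, 0.5, 0.5, 0.5]]
middle = morph_middle_view(r1, r2, disparity, mask1)
```

In the toy example the two interior pixels of `middle` agree with both inputs, because each image is sampled at the matching scene point; the border pixels differ only because out-of-range samples are clamped. In the method described above, the correspondences and masks would come from the encoder-decoder network rather than being set by hand.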
