Paper information - Depth map prediction from a single image with generative adversarial nets

A depth map is a fundamental component of 3D reconstruction, and predicting one from a single image is a challenging task in computer vision. In this paper, we treat depth prediction as an image-to-image task and propose an adversarial convolutional architecture, the Depth Generative Adversarial Network (DepthGAN). To strengthen its image translation ability, we combine a Fully Convolutional Residual Network (FCRN) with a generative adversarial network, a family of models that has shown remarkable results on image-to-image tasks. We also present a new loss function that includes the scale-invariant (SI) error and the structural similarity (SSIM) loss, which improves the model and yields a high-quality depth map. Experiments show that DepthGAN performs better at monocular depth prediction than the current best method on the NYU Depth v2 dataset.
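The abstract names the two ingredients of the loss but not how they are computed or weighted. As a rough illustration only, the sketch below computes a scale-invariant error in its commonly used log-space form and a simplified SSIM term over the whole image, then sums them. The function names, the single-window SSIM, the `lam` parameter, and the weights `w_si` and `w_ssim` are all assumptions made for this example; they are not taken from the paper, which may define and combine these terms differently.

```python
import numpy as np

def scale_invariant_error(pred, target, lam=1.0, eps=1e-8):
    """Log-space scale-invariant error; lam=1.0 ignores global scale entirely."""
    d = np.log(pred + eps) - np.log(target + eps)  # per-pixel log-depth difference
    n = d.size
    return float(np.mean(d ** 2) - lam * (np.sum(d) ** 2) / (n ** 2))

def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    c1 = (0.01 * data_range) ** 2  # usual stabilising constants
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

def combined_loss(pred, target, w_si=1.0, w_ssim=1.0):
    """Hypothetical weighted sum of the SI error and an SSIM-based loss."""
    data_range = float(target.max() - target.min()) or 1.0
    ssim_loss = 1.0 - global_ssim(pred, target, data_range)
    return w_si * scale_invariant_error(pred, target) + w_ssim * ssim_loss

# Usage: compare a synthetic "ground truth" depth map with a noisy prediction.
rng = np.random.default_rng(0)
depth_true = rng.uniform(0.5, 10.0, size=(8, 8))
depth_pred = depth_true * rng.uniform(0.9, 1.1, size=(8, 8))
print(combined_loss(depth_pred, depth_true))
```

With `lam=1.0`, multiplying the prediction by a constant leaves the SI term essentially unchanged, which is the property that gives the error its name. A practical SSIM loss would normally use local sliding windows rather than a single global window.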
