Points2Pix: 3D Point-Cloud to Image Translation Using Conditional GANs

We present the first approach for 3D point-cloud to image translation based on conditional Generative Adversarial Networks (cGANs). The model handles multi-modal information sources from different domains, i.e., raw point sets and images. The generator processes three conditions: the point cloud, encoded both as a raw point set and as a camera projection, and an image background patch, which serves as a constraint to bias environmental texturing. A global approximation function (PointNet) within the generator is applied directly to the point cloud, so the representation learning model incorporates global 3D characteristics directly in the latent feature space. The conditions bias the background and the viewpoint of the generated image. This opens up new ways of augmenting or texturing 3D data, with the aim of generating fully individual images. We evaluated our method on the KITTI and SunRGBD datasets, where it achieved an outstanding object detection inception score.
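To illustrate the idea of a global approximation function applied directly to a raw point set, the following is a minimal sketch of a PointNet-style global feature: the same small MLP is applied to every point, and the per-point features are max-pooled into one vector. It is not the authors' implementation. The function names, layer sizes, and random weights are all assumptions chosen for illustration, and a real model would learn the weights and feed the resulting vector into the generator's latent space.

```python
import random

def shared_mlp(point, layers):
    """Apply the same fully connected ReLU layers to a single 3D point."""
    h = list(point)
    for weights, biases in layers:
        # weights holds one weight vector per output unit
        h = [max(0.0, sum(x * w for x, w in zip(h, unit)) + b)
             for unit, b in zip(weights, biases)]
    return h

def global_feature(points, layers):
    """Max-pool per-point features into one order-independent vector."""
    per_point = [shared_mlp(p, layers) for p in points]
    return [max(channel) for channel in zip(*per_point)]

def random_layers(dims, rng):
    """Hypothetical untrained weights; a real model would learn these."""
    layers = []
    for n_in, n_out in zip(dims[:-1], dims[1:]):
        weights = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

rng = random.Random(0)
layers = random_layers([3, 16, 32], rng)  # assumed sizes: 3 -> 16 -> 32

cloud = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
         for _ in range(256)]
feature = global_feature(cloud, layers)

# Reordering the points must not change the global feature.
shuffled = cloud[:]
rng.shuffle(shuffled)
feature_shuffled = global_feature(shuffled, layers)
print(len(feature), feature == feature_shuffled)
```

Because max pooling is a symmetric function, the resulting vector does not depend on the order of the points, which is what allows an unordered point set to serve as a condition without first being voxelized or rendered.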
