Conditional Adversarial Networks for Multimodal Photo-Realistic Point Cloud Rendering

We investigate whether conditional generative adversarial networks (C-GANs) are suitable for rendering point clouds. For this purpose, we created a dataset of approximately 150,000 image pairs, each consisting of a point cloud rendering and the corresponding camera image. The dataset was recorded with our mobile mapping system, with capture dates spread across one year. Our model learns to predict realistic-looking images from point cloud data alone. We show that this approach can be used to colourize point clouds without using any camera images. Furthermore, we show that by parameterizing the recording date, we can predict realistic-looking views for different seasons from identical input point clouds.
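To make the conditioning idea concrete, below is a minimal PyTorch sketch of a pix2pix-style generator whose input is a point cloud rendering, with the normalized recording date broadcast as an extra image channel. The class name, layer widths, and day-of-year encoding are illustrative assumptions, not the model actually used in the paper.

# Hypothetical sketch of date-conditioned image translation: a generator
# maps a point cloud rendering to a photo-realistic image, with the
# recording date injected as one additional constant input channel.
# All names and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, in_channels: int = 3, out_channels: int = 3):
        super().__init__()
        # One extra input channel carries the normalized day of year,
        # tiled over the full image plane.
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + 1, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, out_channels, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, render: torch.Tensor, day_of_year: torch.Tensor):
        # day_of_year: shape (B,), values in [0, 1); tile it into a
        # constant feature map and concatenate with the rendering.
        b, _, h, w = render.shape
        date_plane = day_of_year.view(b, 1, 1, 1).expand(b, 1, h, w)
        return self.net(torch.cat([render, date_plane], dim=1))

# Example: synthesize the same point cloud view for two seasons.
g = ConditionalGenerator()
view = torch.randn(1, 3, 256, 256)      # stand-in for a point cloud rendering
winter = g(view, torch.tensor([0.05]))  # early January
summer = g(view, torch.tensor([0.55]))  # mid July

Broadcasting the date as a constant feature plane is the simplest conditioning scheme; an embedding layer or spatially-adaptive normalization would be plausible alternatives for the same purpose.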
