Paper Information - InfiniCity: Infinite-Scale City Synthesis

InfiniCity: Infinite-Scale City Synthesis

Toward infinite-scale 3D city synthesis, we propose a novel framework, InfiniCity, which constructs and renders an arbitrarily large, 3D-grounded environment from random noise. InfiniCity decomposes this seemingly impractical task into three feasible modules that take advantage of both 2D and 3D data. First, an infinite-pixel image synthesis module generates arbitrary-scale 2D maps from the bird's-eye view. Next, an octree-based voxel completion module lifts the generated 2D map to 3D octrees. Finally, a voxel-based neural rendering module texturizes the voxels and renders 2D images. InfiniCity can thus synthesize arbitrary-scale, traversable 3D city environments and allows flexible, interactive editing by users. We demonstrate the efficacy of the proposed framework both quantitatively and qualitatively. Project page: https://hubert0527.github.io/infinicity/
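The three-stage decomposition described above can be illustrated with a toy sketch. This is not the InfiniCity implementation: each learned module is replaced by a trivial stand-in (a seeded random height tile, a column extrusion, and an orthographic top-down projection), and every name here (`TILE`, `generate_tile`, `lift_to_voxels`, `render_top_down`) is hypothetical. It only shows how data might flow from a 2D bird's-eye-view map to 3D voxels to a rendered 2D image, and why generating the map tile by tile allows an unbounded extent.

```python
import random

TILE = 4  # side length of one map tile in cells (hypothetical value)


def generate_tile(tx, ty, seed=0):
    """Stage 1 stand-in: a bird's-eye-view height tile at tile coords (tx, ty).

    The random generator is seeded on the tile coordinates, so any tile of an
    unbounded map can be produced on demand, in any order, with the same result.
    """
    rng = random.Random(f"{seed}:{tx}:{ty}")
    return [[rng.randint(0, 3) for _ in range(TILE)] for _ in range(TILE)]


def lift_to_voxels(tile, tx, ty):
    """Stage 2 stand-in: extrude each 2D cell into a column of occupied voxels.

    Only occupied voxels are stored (a sparse set), loosely mirroring the
    motivation for an octree, though no octree is built here.
    """
    voxels = set()
    for j, row in enumerate(tile):
        for i, height in enumerate(row):
            for z in range(height):
                voxels.add((tx * TILE + i, ty * TILE + j, z))
    return voxels


def render_top_down(voxels, x0, y0, size):
    """Stage 3 stand-in: orthographic top-down render of a size x size window.

    Each pixel holds the height of the tallest occupied column at that cell,
    or 0 when the cell is empty.
    """
    image = [[0] * size for _ in range(size)]
    for x, y, z in voxels:
        if x0 <= x < x0 + size and y0 <= y < y0 + size:
            px, py = x - x0, y - y0
            image[py][px] = max(image[py][px], z + 1)
    return image
```

A caller would chain the stages for whichever tiles are currently in view, for example `render_top_down(lift_to_voxels(generate_tile(2, -1), 2, -1), 2 * TILE, -1 * TILE, TILE)`. In this toy setting the rendered window reproduces the tile's height map, since the stand-in renderer simply projects the extruded columns back down.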
