Modulated Periodic Activations for Generalizable Local Functional Representations

Multi-Layer Perceptrons (MLPs) make powerful functional representations for sampling and reconstruction problems involving low-dimensional signals such as images, shapes and light fields. Recent works have significantly improved their ability to represent high-frequency content by using periodic activations or positional encodings. This often comes at the expense of generalization: modern methods are typically optimized for a single signal. We present a new representation that generalizes to multiple instances and achieves state-of-the-art fidelity. We use a dual-MLP architecture to encode the signals. A synthesis network creates a functional mapping from a low-dimensional input (e.g., pixel position) to the output domain (e.g., RGB color). A modulation network maps a latent code corresponding to the target signal to parameters that modulate the periodic activations of the synthesis network. We also propose a local functional representation that enables generalization. The signal's domain is partitioned into a regular grid, with each tile represented by a latent code. At test time, the signal is encoded with high fidelity by inferring (or directly optimizing) the latent codebook. Our approach produces generalizable functional representations of images, videos and shapes, and achieves higher reconstruction quality than prior works that are optimized for a single signal.

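As an illustration of the dual-MLP design described in the abstract, the following is a minimal PyTorch sketch. All concrete choices are assumptions for illustration only: the sine frequency scale (w0 = 30), the ReLU modulation MLP, the layer widths and depth, the 8x8 latent grid, and the names (ModulatedSirenLayer, ModulatedSiren, codebook) are hypothetical, not the paper's exact configuration.

```python
# Minimal sketch of the dual-MLP idea, under the assumptions stated above.
import torch
import torch.nn as nn


class ModulatedSirenLayer(nn.Module):
    """One synthesis layer: a sine activation scaled by a modulation vector."""

    def __init__(self, in_dim, out_dim, w0=30.0):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.w0 = w0  # frequency scale for the periodic activation (assumed value)

    def forward(self, x, alpha):
        # alpha comes from the modulation network and has the same width as out_dim
        return alpha * torch.sin(self.w0 * self.linear(x))


class ModulatedSiren(nn.Module):
    """Synthesis network (coords -> signal) modulated by a latent-conditioned network."""

    def __init__(self, coord_dim=2, latent_dim=256, hidden=256, depth=4, out_dim=3):
        super().__init__()
        dims = [coord_dim] + [hidden] * depth
        self.synthesis = nn.ModuleList(
            [ModulatedSirenLayer(dims[i], dims[i + 1]) for i in range(depth)]
        )
        # Modulation network: maps a tile's latent code to one modulation vector
        # per synthesis layer (a plain ReLU MLP here, as an assumption).
        self.modulation = nn.ModuleList(
            [nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU()) for _ in range(depth)]
        )
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, coords, latent):
        h = coords
        for syn, mod in zip(self.synthesis, self.modulation):
            h = syn(h, mod(latent))
        return self.out(h)


# Local functional representation: the signal's domain is split into a regular
# grid and each tile owns a latent code; the shared network is conditioned on
# the code of the tile that a query point falls into.
model = ModulatedSiren()
codebook = nn.Parameter(torch.randn(8, 8, 256) * 0.01)  # 8x8 grid of tile latents
coords = torch.rand(1024, 2)                            # query positions inside one tile
latent = codebook[3, 5].expand(coords.shape[0], -1)     # that tile's latent, broadcast
rgb = model(coords, latent)                             # -> (1024, 3)
```

In this sketch the synthesis and modulation networks are shared across all tiles; only the per-tile latent codes differ, so encoding a new signal at test time amounts to inferring or directly optimizing the codebook while the network weights stay fixed, which is the generalization mechanism the abstract describes.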