Implicit Neural Representations with Periodic Activation Functions

Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. We propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or Sirens, are ideally suited for representing complex natural signals and their derivatives. We analyze Siren activation statistics to propose a principled initialization scheme and demonstrate the representation of images, wavefields, video, sound, and their derivatives. Further, we show how Sirens can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. Lastly, we combine Sirens with hypernetworks to learn priors over the space of Siren functions.
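To make the idea of a sinusoidal representation network concrete, the following is a minimal sketch of a Siren-style forward pass in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the initialization bounds or the frequency scale, so the uniform bounds (`1/fan_in` for the first layer, `sqrt(6/fan_in)/omega_0` for later layers) and `OMEGA_0 = 30` follow values commonly associated with Sirens. The function names (`init_sine_layer`, `sine_layer`, `build_siren`, `siren_forward`) and the zero biases are hypothetical simplifications chosen for this example.

```python
import math
import random

# Assumed frequency scale applied inside every sine activation.
OMEGA_0 = 30.0


def init_sine_layer(fan_in, fan_out, is_first, rng):
    """Draw weights uniformly from an assumed Siren-style range; biases are zero."""
    if is_first:
        bound = 1.0 / fan_in
    else:
        bound = math.sqrt(6.0 / fan_in) / OMEGA_0
    weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
               for _ in range(fan_out)]
    biases = [0.0] * fan_out
    return weights, biases


def sine_layer(x, weights, biases):
    """Affine map followed by a periodic activation: sin(omega_0 * (W x + b))."""
    return [math.sin(OMEGA_0 * (sum(w * xi for w, xi in zip(row, x)) + b))
            for row, b in zip(weights, biases)]


def build_siren(in_dim, hidden_dim, out_dim, n_hidden, seed=0):
    """Stack n_hidden sine layers plus one final linear layer."""
    rng = random.Random(seed)
    dims = [in_dim] + [hidden_dim] * n_hidden
    layers = [init_sine_layer(dims[i], dims[i + 1], i == 0, rng)
              for i in range(n_hidden)]
    layers.append(init_sine_layer(hidden_dim, out_dim, False, rng))
    return layers


def siren_forward(x, layers):
    """Map a coordinate (e.g. a pixel position) to a signal value."""
    h = list(x)
    for weights, biases in layers[:-1]:
        h = sine_layer(h, weights, biases)
    w_out, b_out = layers[-1]
    # The last layer is linear so the output is not confined to [-1, 1].
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(w_out, b_out)]


if __name__ == "__main__":
    # Untrained network evaluated at one 2D coordinate.
    net = build_siren(in_dim=2, hidden_dim=16, out_dim=1, n_hidden=3)
    print(siren_forward([0.1, -0.2], net))
```

Because every hidden layer is a composition of an affine map and a sine, the derivative of such a network with respect to its input is again built from sinusoids (a cosine is a phase-shifted sine), which is one way to see why these networks are suited to representing a signal together with its derivatives.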
