Neural SDEs as Infinite-Dimensional GANs

Stochastic differential equations (SDEs) are a staple of mathematical modelling of temporal dynamics. However, a fundamental limitation has been that such models have typically been relatively inflexible, which recent work introducing Neural SDEs has sought to solve. Here, we show that the current classical approach to fitting SDEs may be viewed as a special case of (Wasserstein) GANs, and that in doing so the neural and classical regimes may be brought together. The input noise is Brownian motion, the output samples are time-evolving paths produced by a numerical solver, and by parameterising the discriminator as a Neural Controlled Differential Equation (CDE), we obtain Neural SDEs as (in modern machine learning parlance) continuous-time generative time series models. Unlike previous work on this problem, this is a direct extension of the classical approach without reference to either prespecified statistics or density functions. Arbitrary drift and diffusion functions are admissible, so, as the Wasserstein loss has a unique global minimum, any SDE may be learnt in the infinite-data limit.
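To make the construction concrete, below is a minimal illustrative sketch in PyTorch (an assumed framework, not the paper's reference implementation): the generator simulates a neural SDE from Brownian increments with a hand-rolled Euler-Maruyama scheme, the discriminator is an Euler discretisation of a controlled differential equation driven by the generated path, and the two are trained against a Wasserstein loss. All class, function, and parameter names (`Generator`, `Discriminator`, `training_step`, `hidden_dim`, and so on) are hypothetical, and the Lipschitz constraint required by the Wasserstein critic (weight clipping or a gradient penalty) is omitted for brevity.

```python
# Minimal, hypothetical SDE-GAN sketch in PyTorch. Illustrative only.
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Neural SDE generator: dY_t = f(t, Y_t) dt + g(t, Y_t) dW_t,
    simulated with a plain Euler-Maruyama scheme and diagonal noise."""

    def __init__(self, data_dim, hidden_dim=32):
        super().__init__()
        self.initial = nn.Linear(1, data_dim)  # maps initial noise to Y_0
        self.drift = nn.Sequential(nn.Linear(data_dim + 1, hidden_dim), nn.Tanh(),
                                   nn.Linear(hidden_dim, data_dim))
        self.diffusion = nn.Sequential(nn.Linear(data_dim + 1, hidden_dim), nn.Tanh(),
                                       nn.Linear(hidden_dim, data_dim))

    def forward(self, batch_size, ts):
        y = self.initial(torch.randn(batch_size, 1))  # sample Y_0 from noise
        path = [y]
        for t0, t1 in zip(ts[:-1], ts[1:]):
            dt = t1 - t0
            ty = torch.cat([t0.expand(batch_size, 1), y], dim=1)
            dW = torch.randn(batch_size, y.shape[1]) * dt.sqrt()   # Brownian increment
            y = y + self.drift(ty) * dt + self.diffusion(ty) * dW  # Euler-Maruyama step
            path.append(y)
        return torch.stack(path, dim=1)  # shape (batch, time, data_dim)


class Discriminator(nn.Module):
    """Stand-in for the Neural CDE discriminator: a hidden state driven by
    path increments (an Euler discretisation of dH = F(H) dX), scalar readout."""

    def __init__(self, data_dim, hidden_dim=32):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.initial = nn.Linear(data_dim, hidden_dim)
        self.func = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
                                  nn.Linear(hidden_dim, hidden_dim * data_dim))
        self.readout = nn.Linear(hidden_dim, 1)

    def forward(self, path):
        h = self.initial(path[:, 0])
        for i in range(1, path.shape[1]):
            dx = path[:, i] - path[:, i - 1]  # increment of the controlling path
            vector_field = self.func(h).view(-1, self.hidden_dim, path.shape[2])
            h = h + (vector_field @ dx.unsqueeze(-1)).squeeze(-1)
        return self.readout(h).squeeze(-1)


def training_step(gen, disc, real_paths, ts, gen_opt, disc_opt):
    """One Wasserstein-GAN step: critic maximises D(real) - D(fake),
    generator maximises D(fake). Lipschitz constraint omitted for brevity."""
    fake_paths = gen(real_paths.shape[0], ts)
    disc_loss = disc(fake_paths.detach()).mean() - disc(real_paths).mean()
    disc_opt.zero_grad()
    disc_loss.backward()
    disc_opt.step()

    gen_loss = -disc(gen(real_paths.shape[0], ts)).mean()
    gen_opt.zero_grad()
    gen_loss.backward()
    gen_opt.step()
```

In practice one would replace the hand-rolled Euler-Maruyama loop with a dedicated SDE solver and enforce the critic's Lipschitz constraint (for example via a gradient penalty); the sketch above only shows how Brownian noise, a numerical solve, and a CDE-style discriminator fit together under a Wasserstein objective.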
