JKOnet: Proximal Optimal Transport Modeling of Population Dynamics

Consider a heterogeneous population of points evolving with time. While the population evolves, both in size and nature, we can observe it periodically, through snapshots taken at different timestamps. Each of these snapshots is formed by sampling points from the population at that time, and then creating features to recover point clouds. While these snapshots describe the population's evolution on aggregate, they do not directly provide insights into individual trajectories. This scenario is encountered in several applications, notably single-cell genomics experiments, particle tracking, and the study of crowd motion. In this paper, we propose to model these dynamics as resulting from the celebrated Jordan-Kinderlehrer-Otto (JKO) proximal scheme. The JKO scheme posits that the configuration taken by a population at time $t$ is one that trades off a decrease of an energy (the model we seek to learn) against an optimal transport distance to the previous configuration. To that end, we propose JKOnet, a neural architecture that combines an energy model on measures with (small) optimal displacements solved with input convex neural networks (ICNNs). We demonstrate the applicability of our model to explain and predict population dynamics.
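For concreteness, the JKO step described above can be written in its standard form (the step size $\tau > 0$ and the notation below are assumptions of this sketch, with $E$ the energy to be learned and $W_2$ the 2-Wasserstein distance):

$$\mu_{t+1} \in \operatorname*{arg\,min}_{\mu}\; E(\mu) + \frac{1}{2\tau}\, W_2^2(\mu, \mu_t).$$

A minimal particle-level sketch of one such step is given below, assuming a toy quadratic energy in place of the neural energy model and a per-particle squared displacement as a surrogate for the $W_2^2$ proximal term; the actual architecture instead parameterizes displacements with an ICNN and fits the learned energy to observed snapshots with an optimal-transport loss.

```python
import jax
import jax.numpy as jnp

def energy(theta, particles):
    # Toy energy E_theta on an empirical measure: mean squared distance to theta.
    return 0.5 * jnp.mean(jnp.sum((particles - theta) ** 2, axis=-1))

def jko_step(theta, particles, tau=0.1, lr=1.0, n_iters=500):
    # One proximal step: move particles to decrease E_theta while penalizing
    # their squared displacement from the previous configuration.
    def objective(new_particles):
        prox = 0.5 / tau * jnp.mean(jnp.sum((new_particles - particles) ** 2, axis=-1))
        return energy(theta, new_particles) + prox

    grad_fn = jax.jit(jax.grad(objective))
    new_particles = particles
    for _ in range(n_iters):
        new_particles = new_particles - lr * grad_fn(new_particles)
    return new_particles

key = jax.random.PRNGKey(0)
cloud_t = jax.random.normal(key, (256, 2))            # snapshot at time t
cloud_t1 = jko_step(jnp.array([2.0, -1.0]), cloud_t)  # configuration after one JKO step
```

Here the proximal subproblem decouples across particles, so plain gradient descent suffices; learning the energy itself would additionally require comparing the resulting cloud to the next observed snapshot.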
