Proximal Optimal Transport Modeling of Population Dynamics

Consider a population of particles evolving over time, monitored through snapshots made of particles sampled from the population at successive timestamps. Given access only to these snapshots, can we reconstruct individual trajectories for these particles? This question arises in many important scientific problems, notably single-cell genomics. In this paper, we propose to model population dynamics as realizations of a causal Jordan-Kinderlehrer-Otto (JKO) flow of measures. The JKO scheme posits that the configuration taken by a population at time t+1 is a trade-off: it improves on the previous configuration, in the sense that it decreases an energy, while remaining close (in Wasserstein distance) to the configuration observed at time t. Our goal in this work is to learn such an energy from data. To that end, we propose JKOnet, a neural architecture that computes, in an end-to-end differentiable fashion, the JKO flow given a parametric energy and an initial configuration of points. We demonstrate the good performance and robustness of the JKOnet fitting procedure compared to a more direct forward method.
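To make the trade-off in a JKO step concrete, here is a minimal sketch for one special case only: an energy that is a potential energy, i.e. the average of a function V over the particles. In that case the Wasserstein-proximal step decouples across particles, and each particle moves to the minimizer of V(y) + ||y - x||^2 / (2 tau). This is not the JKOnet architecture described above; the function name `jko_step_potential`, the step size `tau`, and the quadratic choice of V are all assumptions made for illustration.

```python
import numpy as np

def jko_step_potential(x, grad_v, tau, n_iter=200, lr=0.1):
    """One JKO step for a potential energy (illustrative sketch).

    x      : (n, d) array of particle positions at time t
    grad_v : callable returning the gradient of the potential V at each row
    tau    : proximal step size (hypothetical parameter name)

    Each particle minimizes V(y) + ||y - x||^2 / (2 * tau), solved here by
    plain gradient descent on y.
    """
    y = x.copy()
    for _ in range(n_iter):
        # Gradient of the per-particle objective with respect to y.
        g = grad_v(y) + (y - x) / tau
        y = y - lr * g
    return y

# Assumed example: V(y) = 0.5 * ||y||^2, so grad V(y) = y.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(5, 2))
tau = 0.5
x1 = jko_step_potential(x0, grad_v=lambda y: y, tau=tau)
```

For this quadratic potential the step has a closed form, x / (1 + tau), so the particles contract toward the origin: the energy decreases while the new configuration stays close to the previous one. Energies that depend on interactions between particles or on density do not decouple this way and require a genuine optimal transport computation.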

[1]  Nicolas Courty,et al.  POT: Python Optimal Transport , 2021, J. Mach. Learn. Res..

[2]  Johan Karlsson,et al.  Estimating ensemble flows on a hidden Markov chain , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).

[3]  Tommi S. Jaakkola,et al.  Learning population-level diffusions with generative recurrent networks , 2016, ICML 2016.

[4]  D. Kinderlehrer,et al.  THE VARIATIONAL FORMULATION OF THE FOKKER-PLANCK EQUATION , 1996 .

[5]  Kevin Scaman,et al.  Lipschitz regularity of deep neural networks: analysis and efficient estimation , 2018, NeurIPS.

[6]  Gabriel Peyré,et al.  Entropic Approximation of Wasserstein Gradient Flows , 2015, SIAM J. Imaging Sci..

[7]  David Duvenaud,et al.  FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models , 2018, ICLR.

[8]  Luis A. Caffarelli,et al.  Monotonicity Properties of Optimal Transportation¶and the FKG and Related Inequalities , 2000 .

[9]  Youssef Mroueh,et al.  Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks , 2021, Trans. Mach. Learn. Res..

[10]  Alexandre d'Aspremont,et al.  Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport , 2019, AISTATS.

[11]  Ronald R. Coifman,et al.  Visualizing structure and transitions in high-dimensional biological data , 2019, Nature Biotechnology.

[12]  Amos J. Storkey,et al.  Towards a Neural Statistician , 2016, ICLR.

[13]  Yee Whye Teh,et al.  Set Transformer , 2018, ICML.

[14]  Evgeny Burnaev,et al.  Large-Scale Wasserstein Gradient Flows , 2021, NeurIPS.

[15]  Wenhan Luo,et al.  Multiple object tracking: A literature review , 2014, Artif. Intell..

[16]  Rahul Singh,et al.  Multi-Marginal Optimal Transport and Probabilistic Graphical Models , 2020, IEEE Transactions on Information Theory.

[17]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  L. Ambrosio,et al.  Gradient Flows: In Metric Spaces and in the Space of Probability Measures , 2005 .

[19]  Han Zhang,et al.  Improving GANs Using Optimal Transport , 2018, ICLR.

[20]  Gabriel Peyré,et al.  Stochastic Deep Networks , 2018, ICML.

[21]  Thomas G. Dietterich,et al.  Collective Graphical Models , 2011, NIPS.

[22]  Jean-David Benamou,et al.  An augmented Lagrangian approach to Wasserstein gradient flows and applications , 2016 .

[23]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[24]  A. Figalli The Optimal Partial Transport Problem , 2010 .

[25]  S. Shreve,et al.  Stochastic differential equations , 1955, Mathematical Proceedings of the Cambridge Philosophical Society.

[26]  Dexter Kozen,et al.  Collective Inference on Markov Models for Modeling Bird Migration , 2007, NIPS.

[27]  David Pfau,et al.  Unrolled Generative Adversarial Networks , 2016, ICLR.

[28]  Gabriel Peyré,et al.  Sample Complexity of Sinkhorn Divergences , 2018, AISTATS.

[29]  Fabian J Theis,et al.  SCANPY: large-scale single-cell gene expression data analysis , 2018, Genome Biology.

[30]  Klaus-Robert Müller,et al.  Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.

[31]  Razvan Pascanu,et al.  On the difficulty of training recurrent neural networks , 2012, ICML.

[32]  Lei Xu,et al.  Input Convex Neural Networks : Supplementary Material , 2017 .

[33]  Saradha Venkatachalapathy,et al.  Predicting cell lineages using autoencoders and optimal transport , 2020, PLoS Comput. Biol..

[34]  Aaron Courville,et al.  Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization , 2021, ICLR.

[35]  Stephen J. Wright,et al.  Data assimilation in weather forecasting: a case study in PDE-constrained optimization , 2009 .

[36]  David Duvenaud,et al.  Optimizing Millions of Hyperparameters by Implicit Differentiation , 2019, AISTATS.

[37]  Patrick L. Combettes,et al.  Proximal Splitting Methods in Signal Processing , 2009, Fixed-Point Algorithms for Inverse Problems in Science and Engineering.

[38]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[39]  Gabriel Peyré,et al.  Computational Optimal Transport , 2018, Found. Trends Mach. Learn..

[40]  David van Dijk,et al.  TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics , 2020, ICML.

[41]  Quentin Mérigot,et al.  Discretization of functionals involving the Monge–Ampère operator , 2014, Numerische Mathematik.

[42]  Caroline Uhler,et al.  Scalable Unbalanced Optimal Transport using Generative Adversarial Networks , 2018, ICLR.

[43]  F. Santambrogio {Euclidean, metric, and Wasserstein} gradient flows: an overview , 2016, 1609.03890.

[44]  Filippo Santambrogio,et al.  Optimal Transport for Applied Mathematicians , 2015 .

[45]  Ricky T. Q. Chen,et al.  Scalable Gradients and Variational Inference for Stochastic Differential Equations , 2019, AABI.

[46]  J. Carrillo,et al.  Primal Dual Methods for Wasserstein Gradient Flows , 2019, Foundations of Computational Mathematics.

[47]  M. Burger,et al.  A mixed finite element method for nonlinear diffusion equations , 2010 .

[48]  Uri Shalit,et al.  Structured Inference Networks for Nonlinear State Space Models , 2016, AAAI.

[49]  W. Stahel,et al.  Stochastic partial differential equation based modelling of large space–time data sets , 2012, 1204.6118.

[50]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[51]  Sewoong Oh,et al.  Optimal transport mapping via input convex neural networks , 2019, ICML.

[52]  Yongxin Chen,et al.  Multimarginal Optimal Transport with a Tree-Structured Cost and the Schrödinger Bridge Problem , 2021, SIAM J. Control. Optim..

[53]  Ronald J. Williams,et al.  A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.

[54]  Yuanyuan Shi,et al.  Optimal Control Via Neural Networks: A Convex Approach , 2018, ICLR.

[55]  Marco Cuturi,et al.  Sinkhorn Distances: Lightspeed Computation of Optimal Transport , 2013, NIPS.

[56]  Fabian J Theis,et al.  Current best practices in single‐cell RNA‐seq analysis: a tutorial , 2019, Molecular systems biology.