Robust Hybrid Learning With Expert Augmentation

Hybrid modelling reduces the misspecification of expert models by combining them with machine learning (ML) components learned from data. As with many ML algorithms, the performance guarantees of hybrid models hold only on the training distribution. Leveraging the insight that the expert model is usually valid even outside the training domain, we overcome this limitation by introducing a hybrid data augmentation strategy termed expert augmentation. Based on a probabilistic formalization of hybrid modelling, we demonstrate that expert augmentation, which can be incorporated into existing hybrid systems, improves generalization. We empirically validate expert augmentation in three controlled experiments modelling dynamical systems with ordinary and partial differential equations. Finally, we assess the potential real-world applicability of expert augmentation on a dataset of a real double pendulum.
