AI Descartes: Combining Data and Theory for Derivable Scientific Discovery

Scientists have long aimed to discover meaningful formulas that accurately describe experimental data. One common approach is to build mathematical models of natural phenomena by hand from domain knowledge and then fit these models to data. Machine-learning algorithms, in contrast, automate the construction of accurate data-driven models but consume large amounts of data, and ensuring that such models are consistent with existing knowledge is an open problem. We develop a method for combining logical reasoning with symbolic regression, enabling principled derivations of models of natural phenomena. We demonstrate these concepts for Kepler’s third law of planetary motion, Einstein’s relativistic time-dilation law, and Langmuir’s theory of adsorption, automatically connecting experimental data with background theory in each case. We show that laws can be discovered from few data points when formal logical reasoning is used to distinguish the correct formula from a set of plausible formulas that have similar error on the data. The combination of reasoning with machine learning provides generalizable insights into key aspects of natural phenomena. We envision that this combination will enable derivable discovery of fundamental laws of science, and we believe it is a crucial first step toward automating the scientific method.
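
The idea of using background theory to choose among formulas with similar error can be illustrated with a small sketch. It is not the method described above: the candidate formulas are hypothetical stand-ins for symbolic-regression output, and the formal reasoning step is replaced by a numeric check of one assumed consequence of Newtonian gravity, namely that for fixed masses the period p scales as d^(3/2) with the orbital distance d. The function names and the 10% error threshold are likewise assumptions made for illustration.

```python
import math

# Orbital data for four planets: (semi-major axis in AU, period in years).
DATA = [(0.387, 0.241), (0.723, 0.615), (1.000, 1.000), (1.524, 1.881)]

# Hypothetical candidates, standing in for what a symbolic regressor might return.
CANDIDATES = {
    "p = d**1.5": lambda d: d ** 1.5,
    "p = 0.55*d**2 + 0.45*d": lambda d: 0.55 * d ** 2 + 0.45 * d,
    "p = d**1.5 + 0.01*d": lambda d: d ** 1.5 + 0.01 * d,
}


def max_relative_error(f, data):
    """Largest relative deviation of f from the observed periods."""
    return max(abs(f(d) - p) / p for d, p in data)


def obeys_scaling_law(f, exponent=1.5, tol=1e-9):
    """Numeric stand-in for a proof (an assumption of this sketch):
    test f(k*d) == k**exponent * f(d) at a few sample points."""
    for d in (0.5, 1.0, 3.0):
        for k in (2.0, 10.0):
            if not math.isclose(f(k * d), k ** exponent * f(d), rel_tol=tol):
                return False
    return True


# Step 1: keep every candidate that fits the few data points reasonably well.
plausible = [name for name, f in CANDIDATES.items()
             if max_relative_error(f, DATA) < 0.1]

# Step 2: keep only candidates consistent with the background-theory consequence.
derivable = [name for name in plausible if obeys_scaling_law(CANDIDATES[name])]

print("fit the data:", plausible)
print("consistent with theory:", derivable)
```

All three candidates pass the data-fit threshold on these four points, but only the power law satisfies the scaling check. A sampled numeric test of this kind can only refute a candidate, whereas a theorem prover can certify that a formula follows from the axioms.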
