An Artificial Experimenter for Enzymatic Response Characterisation

Identifying the characteristics of biological systems through physical experimentation is restricted by the resources available, which are limited in comparison to the size of the parameter spaces being investigated. New tools are required to assist scientists in effectively characterising the behaviours of such systems. A fully autonomous experimentation machine is in development to assist biological response characterisation; it combines artificial intelligence techniques for active experiment selection with a microfluidic experimentation platform that reduces the volume of reactants required per experiment. One part of this machine, an artificial experimenter, has been designed to automatically propose hypotheses and then determine experiments that test those hypotheses and explore the parameter space. Using a multiple-hypotheses approach that allows representative models of response behaviours to be produced from few observations, the artificial experimenter has been employed in a laboratory setting, where it selected experiments for a human scientist to perform in an investigation of the optical absorbance properties of NADH.
