Review: Towards Robot Scientists for autonomous scientific discovery

We review the main components of autonomous scientific discovery and how they lead to the concept of a Robot Scientist. A Robot Scientist is a system that uses techniques from artificial intelligence to automate all aspects of the scientific discovery process: it generates hypotheses from a computer model of the domain, designs experiments to test these hypotheses, runs the physical experiments using robotic systems, analyses and interprets the resulting data, and repeats the cycle. We describe our two prototype Robot Scientists, Adam and Eve. Adam has recently demonstrated the potential of such systems by identifying twelve genes responsible for catalysing specific reactions in the metabolic pathways of the yeast Saccharomyces cerevisiae. This work has been formally recorded in great detail using logic. We argue that the reporting of science needs to become fully formalised and that Robot Scientists can help achieve this. Formalised reporting will make scientific information more reproducible and reusable, and will promote the integration of computers in scientific reasoning. We believe that greater automation of both the physical and intellectual aspects of scientific investigations is essential to the future of science. Greater automation improves the accuracy and reliability of experiments, increases the pace of discovery and, in common with conventional laboratory automation, removes tedious and repetitive tasks from the human scientist.
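The cycle described above (generate hypotheses, design an experiment, run it, interpret the result, repeat) can be illustrated with a toy sketch. This is not the method used by Adam or Eve. The gene names, reaction names, the `choose_experiment` heuristic, and the simulated `run_experiment` function are all hypothetical and stand in for the domain model, experiment planner, and laboratory robotics.

```python
from itertools import product
from typing import Callable, Dict, List

# A hypothesis maps each gene to the reaction it is assumed to encode (toy model).
Hypothesis = Dict[str, str]


def choose_experiment(hypotheses: List[Hypothesis], genes: List[str]) -> str:
    """Pick the gene on which the surviving hypotheses disagree the most."""
    return max(genes, key=lambda g: len({h[g] for h in hypotheses}))


def discovery_loop(
    hypotheses: List[Hypothesis],
    genes: List[str],
    run_experiment: Callable[[str], str],
    max_cycles: int = 10,
) -> List[Hypothesis]:
    """Repeat: design an experiment, run it, discard refuted hypotheses."""
    surviving = list(hypotheses)
    for _ in range(max_cycles):
        if len(surviving) <= 1:
            break  # nothing left to discriminate between
        gene = choose_experiment(surviving, genes)  # experiment design
        observed = run_experiment(gene)             # physical experiment (simulated here)
        # Interpretation: keep only hypotheses consistent with the observation.
        surviving = [h for h in surviving if h[gene] == observed]
    return surviving


# Hypothetical toy domain: two genes, two candidate reactions.
GENES = ["g1", "g2"]
REACTIONS = ["r1", "r2"]
# Hidden ground truth that the simulated "robot" reports, one gene at a time.
TRUTH = {"g1": "r2", "g2": "r1"}

# Hypothesis generation: enumerate every gene-to-reaction assignment.
candidates = [dict(zip(GENES, combo)) for combo in product(REACTIONS, repeat=len(GENES))]

result = discovery_loop(candidates, GENES, run_experiment=lambda g: TRUTH[g])
print(result)
```

In this sketch each experiment is chosen to split the remaining hypotheses as much as possible, so the loop stops once a single consistent hypothesis remains or the cycle budget is spent. A real system would also have to handle noisy measurements and experiment cost, which this toy version ignores.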
