Principles of human-computer collaboration for knowledge discovery in science

An important problem in computational scientific discovery is to identify, among the diversity of discovery programs written in various sciences, a commonality that will take a next step beyond the acknowledged general (but weak) framework of heuristic search. We characterize discovery in science as the generation of novel, interesting, plausible, and intelligible knowledge about the objects of study. We then analyze four current machine discovery programs in chemistry, medicine, mathematics, and linguistics according to how their design, or the circumstances of their application, heighten the chances of finding knowledge that has all four properties. Some general patterns emerge, although some strategies seem idiosyncratic. Our candidate for a commonality, which focuses on human factors, can be used pragmatically to evaluate and compare the designs of discovery programs that are intended to be used as collaborators by scientists. © 1999 Elsevier Science B.V. All rights reserved.
