What Constitutes a Scientific Database?

We propose that a scientific database should be inherently different from, say, a business database. The difference arises from the nature of science itself, in which hypotheses, or logical implications, form an essential part of the discipline. Empirical observations give rise to tentative hypotheses, and individual hypotheses are then tested, and refuted or refined, by further empirical observation. In this paper, we propose representing the observational data of science in a lattice format that also conveys all the logical implications supported by those observations. We claim that such a structure can be created incrementally and that the hypotheses it encodes adapt to new data. We demonstrate its practicality by presenting two real situations in which it has been used. Finally, we examine the considerable storage costs associated with this approach and discuss other limitations that remain unresolved.
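
To make the idea of a lattice that "conveys all the logical implications" more concrete, the following is a minimal sketch using standard closed-set (formal concept) machinery. It is not the implementation described in the paper: the observation data, attribute names, and function names are all hypothetical, and the closed sets are recomputed by brute force rather than maintained incrementally. It only illustrates how an implication supported by the current observations can be refuted when a new observation arrives.

```python
from itertools import combinations

# Hypothetical observations: each observed object maps to its attribute set.
OBSERVATIONS = {
    "obs1": {"cap", "gills", "white"},
    "obs2": {"cap", "gills", "brown"},
    "obs3": {"cap", "pores", "brown"},
}


def extent(attrs, observations):
    """Return the objects that exhibit every attribute in attrs."""
    return {obj for obj, seen in observations.items() if attrs <= seen}


def intent(objs, observations):
    """Return the attributes shared by every object in objs.

    By convention, an empty set of objects shares all known attributes.
    """
    if not objs:
        return set().union(*observations.values())
    return set.intersection(*(observations[obj] for obj in objs))


def closure(attrs, observations):
    """Close an attribute set: everything implied by attrs in the data."""
    return intent(extent(attrs, observations), observations)


def closed_sets(observations):
    """Enumerate all closed attribute sets (the lattice elements).

    Brute force over all subsets, so exponential in the attribute count;
    an assumption made for brevity, not an incremental construction.
    """
    all_attrs = sorted(set().union(*observations.values()))
    found = set()
    for size in range(len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            found.add(frozenset(closure(set(combo), observations)))
    return found


def implies(premise, conclusion, observations):
    """True if every observation with all of premise also has conclusion."""
    return conclusion <= closure(premise, observations)


if __name__ == "__main__":
    # Supported by the three observations above: gills -> cap.
    print(implies({"gills"}, {"cap"}, OBSERVATIONS))

    # A new (hypothetical) observation with gills but no cap refutes it.
    updated = dict(OBSERVATIONS, obs4={"gills", "stalkless"})
    print(implies({"gills"}, {"cap"}, updated))
```

In this sketch every implication is read off the closure operator, so adding an observation changes which sets are closed and therefore which hypotheses survive. Enumerating closures over all attribute subsets also hints at why storage and computation costs can become considerable as the number of attributes grows.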
