Declarative and Preferential Bias in GP-based Scientific Discovery

This work examines two methods for evolving dimensionally correct equations from data. It demonstrates that using units of measurement aids in evolving equations that domain specialists can interpret. One method uses a strong typing approach that implements a declarative bias towards correct equations; the other uses a coercion mechanism to implement a preferential bias towards the same objective. Four experiments on real-world, unsolved scientific problems were performed to examine the differences between the approaches and to judge the worth of the induction methods. Not only does the coercion approach perform significantly better than the strongly typed approach on two of the four problems, but it also regularizes the expressions it induces, resulting in a more reliable search process. A trade-off between type correctness and the ability to solve the problem is identified. Due to the preferential bias implemented in the coercion approach, this trade-off does not lead to sub-optimal performance. No evidence is found that the reduction of the search space achieved through declarative bias helps in finding better solutions faster. In fact, for the class of scientific discovery problems, the opposite seems to be the case.
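The difference between the two biases can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the tuple-based expression trees, the exponent-vector encoding of units over (length, mass, time), and the names `infer`, `total_cost`, and `admissible_strongly_typed` are all assumptions made for illustration. A declarative bias rejects any expression with a dimensional violation, whereas a preferential bias keeps the expression and assigns it a cost that the search can trade off against fit.

```python
# Units encoded as exponent vectors over (length, mass, time).
# This encoding is a hypothetical choice for the sketch.
LENGTH = (1, 0, 0)
TIME = (0, 0, 1)
VELOCITY = (1, 0, -1)


def dim_distance(a, b):
    """Sum of absolute exponent differences between two unit vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def infer(expr, units):
    """Return (dimension, coercion_cost) for a tuple-based expression tree.

    Leaves are ("var", name); inner nodes are (op, left, right) with
    op in {"add", "mul", "div"}.
    """
    op = expr[0]
    if op == "var":
        return units[expr[1]], 0
    left_dim, left_cost = infer(expr[1], units)
    right_dim, right_cost = infer(expr[2], units)
    cost = left_cost + right_cost
    if op == "mul":
        return tuple(x + y for x, y in zip(left_dim, right_dim)), cost
    if op == "div":
        return tuple(x - y for x, y in zip(left_dim, right_dim)), cost
    if op == "add":
        # Addition needs equal units. Here the right operand is coerced to
        # the left operand's units and the mismatch is charged as a cost.
        return left_dim, cost + dim_distance(left_dim, right_dim)
    raise ValueError(f"unknown operator: {op}")


def total_cost(expr, units, target):
    """Preferential bias: a penalty to minimise alongside the fit error."""
    dim, cost = infer(expr, units)
    return cost + dim_distance(dim, target)


def admissible_strongly_typed(expr, units, target):
    """Declarative bias: only violation-free expressions are allowed."""
    return total_cost(expr, units, target) == 0


units = {"d": LENGTH, "t": TIME, "v": VELOCITY}
good = ("div", ("var", "d"), ("var", "t"))  # d / t has units of velocity
bad = ("add", ("var", "v"), ("var", "d"))   # velocity + length is a mismatch
```

Under these assumptions, `good` is admissible under both schemes, while `bad` is excluded by the strongly typed check but remains a candidate with a non-zero penalty under the coercion-style scheme.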
