Word count as a traditional programming benchmark problem for genetic programming

The Unix utility program wc, which stands for "word count," takes any number of files and prints the number of newlines, words, and characters in each file. We show that genetic programming can find programs that replicate the core functionality of the wc utility, and we propose this problem as a "traditional programming" benchmark for genetic programming systems. This "wc problem" features key elements of the programming tasks that human programmers often face, including requirements for multiple data types, a large instruction set, control flow, and multiple outputs. Furthermore, it mimics the behavior of a real-world utility program, showing that genetic programming can automatically synthesize programs with general utility. We suggest statistical procedures for comparing the performance of different systems on traditional programming problems such as the wc problem, and present the results of a short experiment using the problem. Finally, we give a short analysis of evolved solution programs, showing how they make use of traditional programming concepts.
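As a point of reference for the target behavior, the core of wc can be sketched in a few lines of Python. This is only an illustration of the three outputs described above, not the evolved programs or the benchmark's actual test harness; the function name `wc_counts` is hypothetical, and the sketch assumes that words are maximal runs of non-whitespace characters and that the input is a single string rather than a file.

```python
def wc_counts(text: str) -> tuple[int, int, int]:
    """Return (newlines, words, characters) for a string, in the spirit of wc.

    Assumptions (not taken from the paper): words are separated by any
    whitespace, and characters are counted as Python code points.
    """
    newlines = text.count("\n")   # number of newline characters
    words = len(text.split())     # whitespace-delimited tokens
    chars = len(text)             # every character, including whitespace
    return newlines, words, chars


if __name__ == "__main__":
    sample = "hello world\nfoo bar baz\n"
    print(wc_counts(sample))
```

The three return values mirror the multiple outputs that the problem requires, and the mix of string and integer operations hints at why a solution needs more than one data type.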
