Symbolic regression on noisy data with genetic and gene expression programming

This paper presents a novel method for performing regression on a finite sample of noisy data. The aim is to obtain a mathematical model of the data that is both reliable and valid, without restricting its analytical expression to any particular form. To obtain a statistical model of the noisy data set, we use symbolic regression with pseudorandom number generators. We begin by describing symbolic regression and our implementation of this technique using genetic programming (GP) and gene expression programming (GEP). In the final part of the paper, we present results for symbolic regression on computer-generated and real financial data sets.