Convergency of Genetic Regression In Data Mining Based On Gene Expression Programming and Optimized Solution

Abstract This paper investigates the convergence in probability of genetic regression in data mining based on Gene Expression Programming (GEP) and of a proposed optimized GEP-based algorithm, the Minimized Residual Sum of Square Genetic Algorithm (MRSSGA). Extensive experiments on Genetic Programming (GP), GEP, and MRSSGA show that: (1) all three algorithms can find the target function from data with low noise; (2) in terms of convergence speed on simple data, GEP is 20 times faster than GP and MRSSGA is 60 times faster than GP; (3) for very complex data with an unknown function type, GEP and MRSSGA are respectively 900 and 1800 times faster than GP at finding ideal functions; and (4) on real-world data, models built with genetic regression methods are much more precise than those built with traditional methods.