Toward Machine Learning Through Genetic Code-like Transformations

Gene expression in nature involves several representation transformations of the genome. Translation is one of them: it constructs the amino acid sequence of a protein from the nucleic acid-based mRNA sequence. Translation is defined by a code book known as the universal genetic code. This paper explores the role of the genetic code and similar representation transformations in enhancing the performance of inductive machine learning algorithms. It considers an abstract model of genetic code-like transformations (GCTs) introduced elsewhere [21] and develops the notion of randomized GCTs. It shows that randomized GCTs can construct a representation of the learning problem in which the mean-square-error surface is almost convex quadratic and therefore easier to minimize. The effect is analyzed using the functionally complete Fourier representation of Boolean functions, and experimental results are offered to substantiate the claim. In particular, a linear classifier such as the Perceptron [38] can learn the non-linear XOR and DNF functions using a gradient-descent algorithm in a representation constructed by randomized GCTs. The paper also discusses the immediate challenges that must be solved before the proposed technique becomes a viable approach to representation construction in machine learning.
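
As a rough illustration of why XOR is hard for a linear classifier in its original representation, the sketch below computes the Fourier (Walsh) coefficients of a Boolean function by brute force. It is not the paper's GCT construction; the function name `walsh_coefficients` and the normalization by 2^n are assumptions made for this example only.

```python
from itertools import product

def walsh_coefficients(f, n):
    """Brute-force Walsh coefficients of f: {0,1}^n -> number.

    Assumed convention: w_j = 2^-n * sum_x f(x) * (-1)^(x . j),
    where x . j is the bitwise inner product of x and j.
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for j in points:
        total = 0.0
        for x in points:
            parity = sum(xi * ji for xi, ji in zip(x, j)) % 2
            total += f(x) * (-1) ** parity
        coeffs[j] = total / len(points)
    return coeffs

def xor(x):
    # Two-input XOR, the non-linear target mentioned in the abstract.
    return x[0] ^ x[1]

coeffs = walsh_coefficients(xor, 2)
for j, w in sorted(coeffs.items()):
    print(j, w)
```

Under this convention both order-1 coefficients of XOR, indexed by (0, 1) and (1, 0), are zero, and all non-constant weight sits on the order-2 coefficient (1, 1). A model that is linear in the original inputs therefore has nothing to fit, which is the kind of structure a change of representation would need to alter.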

[1]  L. Hurst, et al. Early fixation of an optimal genetic code, 2000, Molecular Biology and Evolution.

[2]  Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[3]  John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.

[4]  J. Bashford, et al. A supersymmetric model for the evolution of the genetic code, 1998, Proceedings of the National Academy of Sciences of the United States of America.

[5]  Dirk Thierens, et al. Scalability Problems of Simple Genetic Algorithms, 1999, Evolutionary Computation.

[6]  Hillol Kargupta, et al. A perspective on the foundation and evolution of the linkage learning genetic algorithms, 2000.

[7]  Michael O'Neill, et al. Genetic Code Degeneracy: Implications for Grammatical, 1999, ECAL.

[8]  J. Otsuka, et al. Evolution of genetic information flow from the viewpoint of protein sequence similarity, 1994, Journal of Theoretical Biology.

[9]  Hillol Kargupta. Gene expression: The missing link in evolutionary computation, 1997.

[10]  J. Monod, et al. Genetic regulatory mechanisms in the synthesis of proteins, 1961, Journal of Molecular Biology.

[11]  Anne Brindle, et al. Genetic algorithms for function optimization, 1980.

[12]  Wolfgang Banzhaf, et al. The evolution of genetic code in Genetic Programming, 1999.

[13]  Peter F. Stadler, et al. Fast Fourier Transform for Fitness Landscapes, 2002.

[14]  P. Schuster. The Role of Neutral Mutations in the Evolution of RNA Molecules, 1997.

[15]  Nello Cristianini, et al. An introduction to Support Vector Machines, 2000.

[16]  Hillol Kargupta, et al. Learning functions using randomized genetic code-like transformations: probabilistic properties and experimentations, 2004, IEEE Transactions on Knowledge and Data Engineering.

[17]  Cândida Ferreira, et al. Gene Expression Programming: A New Adaptive Algorithm for Solving Problems, 2001, Complex Syst.

[18]  P. Béland, et al. The origin and evolution of the genetic code, 1994, Journal of Theoretical Biology.

[19]  R. Knight, et al. The Early Evolution of the Genetic Code, 2000, Cell.

[20]  M. Victor Wickerhauser, et al. Adapted wavelet analysis from theory to software, 1994.

[21]  Hillol Kargupta, et al. Function induction, gene expression, and evolutionary representation construction, 1999.

[22]  Michael O'Neill, et al. Grammatical Evolution: Evolving Programs for an Arbitrary Language, 1998, EuroGP.

[23]  R. Rosenberg. Simulation of genetic populations with biochemical properties: technical report, 1967.

[24]  D. R. McGregor, et al. Designing application-specific neural networks using the structured genetic algorithm, 1992, Proceedings of COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks.

[25]  Hillol Kargupta, et al. SEARCH, Computational Processes in Evolution, and Preliminary Development of the Gene Expression Messy Genetic Algorithm, 1997, Complex Syst.

[26]  Wolfgang Banzhaf, et al. Genotype-Phenotype-Mapping and Neutral Variation - A Case Study in Genetic Programming, 1994, PPSN.

[27]  Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[28]  Hillol Kargupta, et al. Extending the class of order-k delineable problems for the gene expression messy genetic algorithm, 1996.

[29]  B. Sankur, et al. Applications of Walsh and related functions, 1986.

[30]  Dirk Thierens. Estimating the significant non-linearities in the genome problem-coding, 1999.

[31]  Hillol Kargupta, et al. A Striking Property of Genetic Code-like Transformations, 2001, Complex Syst.

[32]  Eyal Kushilevitz, et al. Learning decision trees using the Fourier spectrum, 1991, STOC '91.

[33]  Vladimir Vapnik, et al. The Nature of Statistical Learning, 1995.

[34]  J. Walsh. A Closed Set of Normal Orthogonal Functions, 1923.

[35]  Annie S. Wu, et al. A Survey of Intron Research in Genetics, 1996, PPSN.

[36]  Stuart A. Kauffman, et al. Origins of Order, 2019, Origins of Order.

[37]  Bernard Widrow, et al. Adaptive switching circuits, 1988.

[38]  J. C. Jackson. The harmonic sieve: a novel application of Fourier analysis to machine learning theory and practice, 1996.

[39]  Hillol Kargupta, et al. The Gene Expression Messy Genetic Algorithm, 1996, Proceedings of IEEE International Conference on Evolutionary Computation.

[40]  Hornos. Algebraic model for the evolution of the genetic code, 1993, Physical Review Letters.

[41]  John Daniel Bagley, et al. The behavior of adaptive systems which employ genetic and correlation algorithms: technical report, 1967.

[42]  A. A. Mullin, et al. Principles of neurodynamics, 1962.

[43]  Hillol Kargupta, et al. DNA To Protein: Transformations and Their Possible Role in Linkage Learning, 1997, ICGA.

[44]  Hillol Kargupta, et al. Gene Expression and Fast Construction of Distributed Evolutionary Representation, 2001, Evolutionary Computation.

[45]  Annie S. Wu, et al. Empirical Studies of the Genetic Algorithm with Noncoding Segments, 1995, Evolutionary Computation.

[46]  Christian M. Reidys, et al. Evolution on Random Structures, 1995.

[47]  Kalyanmoy Deb, et al. Messy Genetic Algorithms: Motivation, Analysis, and First Results, 1989, Complex Syst.