Molecular learning with DNA kernel machines

We present a computational learning method for bio-molecular classification, showing how biochemical operations can be designed for both learning and pattern classification. In contrast to prior work, our molecular algorithm learns generic classes and is designed for in vitro realization as a sequence of molecular biological operations on sets of DNA examples. Specifically, hybridization between DNA molecules is interpreted as computing the inner product between embedded vectors in a corresponding vector space, and our algorithm learns a binary classifier in this vector space. We analyze the thermodynamic behavior of these learning algorithms, present simulations on artificial and real datasets, and report preliminary wet-lab results obtained using gel electrophoresis.
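
To make the inner-product view concrete, the following is a minimal in silico sketch, not the molecular protocol described above. It assumes a k-mer spectrum kernel as a stand-in for hybridization affinity and a kernel perceptron as the binary learner; both choices, along with all function names and the toy sequences, are illustrative assumptions rather than the embedding or learning rule used in this work.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Embed a DNA string as a sparse vector of k-mer counts."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of two k-mer count vectors (assumed proxy for hybridization)."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(ca[m] * cb[m] for m in ca if m in cb)


def decision_score(seq, seqs, labels, alpha, k=3):
    """Weighted sum of kernel values against the training examples."""
    return sum(a * y * spectrum_kernel(x, seq, k)
               for a, x, y in zip(alpha, seqs, labels))


def train_kernel_perceptron(seqs, labels, epochs=10, k=3):
    """Kernel perceptron: raise an example's weight whenever it is misclassified."""
    alpha = [0] * len(seqs)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(seqs, labels)):
            if y * decision_score(x, seqs, labels, alpha, k) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # training set is separated; stop early
            break
    return alpha


def predict(seq, seqs, labels, alpha, k=3):
    """Return +1 or -1 according to the sign of the decision score."""
    return 1 if decision_score(seq, seqs, labels, alpha, k) > 0 else -1


# Toy data (hypothetical): AT-rich sequences are positive, GC-rich negative.
train_seqs = ["ATATATAT", "TATATATA", "GCGCGCGC", "CGCGCGCG"]
train_labels = [1, 1, -1, -1]
alpha = train_kernel_perceptron(train_seqs, train_labels)
print(predict("ATATATTA", train_seqs, train_labels, alpha))
print(predict("GCGCGGCG", train_seqs, train_labels, alpha))
```

In this sketch the per-example weights `alpha` play the role that relative strand concentrations might play in a molecular realization; that correspondence is offered only as an intuition, and physical hybridization would additionally involve complementary strands and thermodynamic effects that the toy kernel ignores.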
