Knowledge Discovery and Emergent Complexity in Bioinformatics

In February 1943, the Austrian physicist Erwin Schrödinger, one of the founding fathers of quantum mechanics, gave a series of lectures at Trinity College in Dublin entitled “What Is Life? The Physical Aspect of the Living Cell and Mind”. In these lectures Schrödinger stressed the fundamental differences encountered when observing animate and inanimate matter, and advanced hypotheses about the nature and molecular structure of genes that were audacious at the time, some ten years before the discoveries of Watson and Crick.
