Mathematics and evolutionary biology make bioinformatics education comprehensible

The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes, the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software, the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers from a ‘two-culture’ problem, because few practitioners have broad expertise that spans both groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide: the curriculum is too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. comparing alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce offline, downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.
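
As a small illustration of the first topic listed above, tree enumeration, the sketch below counts the distinct labelled, strictly bifurcating tree topologies for a given number of taxa. It relies on the standard double-factorial results, (2n-5)!! unrooted trees for n >= 3 taxa and (2n-3)!! rooted trees for n >= 2 taxa; the function names are our own and are not taken from the BioQUEST tools, and this is not presented as the authors' implementation.

```python
def double_factorial(k):
    """Return k * (k - 2) * (k - 4) * ... down to 1 (1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa):
    """Number of labelled unrooted bifurcating trees: (2n - 5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa):
    """Number of labelled rooted bifurcating trees: (2n - 3)!! for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    # Show how quickly the number of candidate trees grows with taxa.
    for n in (4, 5, 10, 20):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

Ten taxa already admit 2,027,025 unrooted topologies, which is one reason exhaustive tree search becomes impractical for large data sets and heuristic strategies have to be compared critically.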
