Parametric inference for biological sequence analysis.

One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
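To make the geometric idea concrete, here is a minimal sketch of polytope propagation on a toy two-state hidden Markov model. The model and its two parameters are assumptions made for illustration only and are not taken from the article: one parameter marks a state switch and the other marks an emission whose state differs from the observed symbol, so every hidden path contributes a monomial with exponent vector (switches, mismatches). The forward recursion of the sum-product algorithm is kept, but each sum becomes the convex hull of a union and each multiplication by a monomial becomes a translation (the Minkowski sum with a single point). The helper names `convex_hull`, `step_exponent`, `newton_polytope`, and `brute_force` are hypothetical.

```python
from itertools import product

STATES = (0, 1)  # toy two-state model (an assumption for illustration)

def convex_hull(points):
    """Andrew's monotone chain in 2D; returns the hull vertices."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def step_exponent(prev, cur, symbol):
    """Exponent vector of one step: (state switch?, emission mismatch?)."""
    return (int(prev != cur), int(cur != symbol))

def newton_polytope(obs):
    """Forward recursion with sum -> convex hull and product -> translation."""
    # One polytope per state, each starting as a single point.
    polys = {s: [(0, int(s != obs[0]))] for s in STATES}
    for symbol in obs[1:]:
        new_polys = {}
        for cur in STATES:
            candidates = []
            for prev in STATES:
                dx, dy = step_exponent(prev, cur, symbol)
                # Multiplying by a monomial translates the polytope.
                candidates.extend((x + dx, y + dy) for x, y in polys[prev])
            # Summing over previous states takes the hull of the union.
            new_polys[cur] = convex_hull(candidates)
        polys = new_polys
    return convex_hull([p for s in STATES for p in polys[s]])

def brute_force(obs):
    """Reference answer: enumerate every hidden path explicitly."""
    points = []
    for path in product(STATES, repeat=len(obs)):
        switches = sum(a != b for a, b in zip(path, path[1:]))
        mismatches = sum(s != o for s, o in zip(path, obs))
        points.append((switches, mismatches))
    return convex_hull(points)

if __name__ == "__main__":
    print(newton_polytope([0, 1, 1, 0, 1]))
```

The propagated polytope agrees with brute-force enumeration, yet the recursion only ever stores hull vertices rather than all 2^n paths. In this toy setting, each vertex corresponds to a hidden path that is optimal for some choice of the two parameters, which is the sense in which the polytope summarizes the parametric behavior of maximum a posteriori inference.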
