Probabilistic Methods for Bioinformatics: with an Introduction to Bayesian Networks

The Bayesian network is one of the most important architectures for representing and reasoning with multivariate probability distributions. Used in conjunction with specialized informatics, it makes real-world applications possible. Probabilistic Methods for Bioinformatics explains the application of probability and statistics, in particular Bayesian networks, to genetics. The book provides background material on probability, statistics, and genetics, and then moves on to Bayesian networks and their applications to bioinformatics. Rather than getting bogged down in proofs and algorithms, it explains probabilistic methods for biological information and Bayesian networks in an accessible way, using applications and case studies. It discusses the many useful applications of Bayesian networks developed in the past 10 years, forming a review of all the significant work in a field whose methods will arguably become the most prevalent in biological data analysis. Key features: unique coverage of probabilistic reasoning methods applied to bioinformatics data, methods that are likely to become the standard analysis tools for bioinformatics; insights about when and why probabilistic methods can and cannot be used effectively; and a complete review of Bayesian networks and probabilistic methods with a practical approach.
