Graphical models for inferring single molecule dynamics

Background: The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization algorithm (VBEM). The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well.

Results: The VBEM algorithm returns the model’s evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model’s parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem.

Conclusions: The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.
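To make the model in the abstract concrete, the following is a minimal Python sketch of an HMM with Gaussian observables, together with the forward algorithm that evaluates the likelihood ML/EM would maximize. It is an illustration only, not the article's implementation: VBEM itself is not shown, and the function names, the two-state setup, and all parameter values (FRET-like means 0.3 and 0.7, noise 0.05, the transition matrix) are assumptions chosen for the example.

```python
import numpy as np


def simulate_gaussian_hmm(T, pi, A, mu, sigma, rng):
    """Draw hidden states z and noisy observations x from a Gaussian HMM."""
    K = len(pi)
    z = np.empty(T, dtype=int)
    z[0] = rng.choice(K, p=pi)
    for t in range(1, T):
        # Next state depends only on the current state (Markov property).
        z[t] = rng.choice(K, p=A[z[t - 1]])
    # Each observation is Gaussian around the mean of its hidden state.
    x = rng.normal(mu[z], sigma[z])
    return z, x


def log_gaussian(x, mu, sigma):
    """Log density of every observation under every state; shape (T, K)."""
    d = (x[:, None] - mu[None, :]) / sigma[None, :]
    return -0.5 * d**2 - np.log(sigma[None, :]) - 0.5 * np.log(2 * np.pi)


def hmm_log_likelihood(x, pi, A, mu, sigma):
    """log p(x | pi, A, mu, sigma) via the forward algorithm in log space."""
    log_b = log_gaussian(x, mu, sigma)
    log_alpha = np.log(pi) + log_b[0]
    for t in range(1, len(x)):
        # Log-sum-exp over the previous state, then add the emission term.
        m = log_alpha.max()
        log_alpha = m + np.log(np.exp(log_alpha - m) @ A) + log_b[t]
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())


# Hypothetical two-state, smFRET-like parameters (illustrative values only).
rng = np.random.default_rng(0)
pi = np.array([0.5, 0.5])                    # initial state probabilities
A = np.array([[0.95, 0.05], [0.10, 0.90]])   # transition matrix, rows sum to 1
mu = np.array([0.3, 0.7])                    # per-state mean signal
sigma = np.array([0.05, 0.05])               # per-state noise

z, x = simulate_gaussian_hmm(200, pi, A, mu, sigma, rng)
print("log-likelihood at the generating parameters:",
      hmm_log_likelihood(x, pi, A, mu, sigma))
```

The quantity computed here is the objective that EM climbs for a fixed number of states. Comparing models with different numbers of states calls for the evidence returned by VBEM, which this sketch does not compute.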
