Allosteric mechanism of the circadian protein Vivid resolved through Markov state model and machine learning analysis

The fungal circadian clock photoreceptor Vivid (VVD) contains a photosensitive, allosteric light, oxygen, voltage (LOV) domain that undergoes a large N-terminal conformational change. How blue-light-driven covalent bond formation leads to this global conformational change remains unclear, which hinders further development of VVD as an optogenetic tool. We addressed this question using a novel computational platform that integrates Markov state models, machine learning methods, and newly developed community analysis algorithms. Applying this integrative approach, we quantitatively evaluated the contribution of the covalent bond to the global conformational change of the protein and proposed an atomistic allosteric mechanism, which revealed the unexpected importance of the A’α/Aβ loop and the previously overlooked Eα/Fα loop in the conformational change. This approach could be applied to other allosteric proteins to provide interpretable atomistic representations of their otherwise elusive allosteric mechanisms.
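To make the Markov state model component concrete, the following is a minimal, generic sketch of how a transition matrix and its stationary distribution can be estimated from a trajectory that has already been discretized into conformational states. It is not the workflow used in this study: the function names, the toy trajectory, the lag time, and the naive count symmetrization are all illustrative assumptions.

```python
import numpy as np

def estimate_msm(dtraj, n_states, lag=1):
    """Estimate a row-stochastic transition matrix from a discrete trajectory.

    Assumption: counts are symmetrized (C + C^T) as a naive way to enforce
    detailed balance; production MSM tools typically use more careful
    estimators.
    """
    counts = np.zeros((n_states, n_states))
    # Count transitions i -> j observed at the chosen lag time.
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    counts = counts + counts.T
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state must be visited at least once")
    return counts / row_sums

def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmax(evals.real)
    pi = np.abs(evecs[:, idx].real)
    return pi / pi.sum()

# Hypothetical two-state trajectory (for example, "dark-like" = 0, "light-like" = 1).
dtraj = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
T = estimate_msm(dtraj, n_states=2, lag=1)
pi = stationary_distribution(T)
```

In this toy setting, the stationary distribution gives the equilibrium population of each state, and the off-diagonal entries of the transition matrix give the probability of switching between states within one lag time.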
