Can Protein Structure Prediction Methods Capture Alternative Conformations of Membrane Proteins?

Understanding the conformational dynamics of proteins, such as the transition between inward-facing (IF) and outward-facing (OF) states observed in transporters, is vital for elucidating their functional mechanisms. Despite significant advances in protein structure prediction (PSP) over the past three decades, most efforts have focused on single-state prediction, leaving multi-state or alternative conformation prediction (ACP) relatively unexplored. As a result, highly accurate PSP methods such as AlphaFold have been developed, yet their capabilities for ACP remain limited. To investigate the performance of current PSP methods in ACP, we curated a dataset, named IOMemP, consisting of 32 experimentally determined high-resolution IF and OF structures of 16 membrane proteins. We benchmarked 12 representative PSP methods, along with two recent multi-state methods based on AlphaFold, against this dataset. Our findings reveal an escalating bias towards one specific state in deep learning-based methods and a remarkably consistent preference for specific states across various PSP methods. We elucidated how coevolution information in multiple sequence alignments (MSAs) influences the state preference. Moreover, we showed that AlphaFold, when coevolution information is excluded, estimated similar energies for the experimental IF and OF conformations, indicating that the energy model learned by AlphaFold is not biased towards any particular state. We anticipate that the IOMemP dataset and benchmark results will advance the development of robust ACP methods.
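As an illustration of what a state preference could mean in such a benchmark, the sketch below labels each predicted model by whichever experimental state it resembles more and summarizes the bias across proteins. It is a minimal sketch under stated assumptions, not the procedure used in this work: the abstract does not specify the similarity metric, so the scores are assumed to be precomputed values in [0, 1] (for example, a TM-score against each experimental structure), and the names `ModelScores`, `preferred_state`, and `state_bias` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelScores:
    """Similarity of one predicted model to each experimental state.

    Assumption: both scores are precomputed, lie in [0, 1], and higher
    means more similar (e.g. a TM-score). The abstract does not name the
    metric, so this is illustrative only.
    """
    protein: str
    score_if: float  # similarity to the inward-facing (IF) structure
    score_of: float  # similarity to the outward-facing (OF) structure


def preferred_state(m: ModelScores, margin: float = 0.0) -> str:
    """Return "IF", "OF", or "tie" depending on which state scores higher.

    `margin` is a hypothetical tolerance: differences no larger than it
    are treated as no clear preference.
    """
    diff = m.score_if - m.score_of
    if diff > margin:
        return "IF"
    if diff < -margin:
        return "OF"
    return "tie"


def state_bias(models: List[ModelScores]) -> float:
    """Mean absolute IF/OF score difference; 0.0 means no state preference."""
    if not models:
        raise ValueError("need at least one model")
    return sum(abs(m.score_if - m.score_of) for m in models) / len(models)


# Made-up example scores for two hypothetical proteins.
example = [
    ModelScores("protA", score_if=0.75, score_of=0.5),
    ModelScores("protB", score_if=0.5, score_of=0.875),
]
labels = [preferred_state(m) for m in example]
bias = state_bias(example)
```

Comparing the per-protein labels between methods would show whether different methods favor the same state, and a larger `state_bias` would indicate a stronger preference for one state over the other.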
