Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction in CASP13

Ab initio prediction of protein structure from sequence is one of the most challenging and important problems in bioinformatics and computational biology. After a long period of stagnation, ab initio protein structure prediction is undergoing a revolution driven by inter-residue contact distance prediction empowered by deep learning. In this talk, I will present the deep learning and contact distance prediction methods of our MULTICOM protein structure prediction system, which ranked among the top three methods in the 13th community-wide Critical Assessment of Techniques for Protein Structure Prediction (CASP13) in 2018 [1]. MULTICOM correctly folded the structures of numerous hard protein targets from scratch in CASP13, an unprecedented advance. This success demonstrates that contact distance prediction is the key direction for tackling the protein structure prediction challenge and that deep learning is the key technology for solving it. However, to solve the problem completely, more advanced deep learning methods are needed for three tasks: accurately predicting inter-residue distances when few homologous sequences are available to calculate residue-residue co-evolution scores, folding proteins from noisy inter-residue distances, and ranking the structural models of hard protein targets.