In silico prediction of in vitro protein liquid-liquid phase separation experiments outcomes with multi-head neural attention

MOTIVATION Proteins able to undergo liquid-liquid phase separation (LLPS) in vivo and in vitro are drawing considerable interest because of their functional relevance for cell life. Nevertheless, proteome-scale experimental screening of these proteins appears infeasible: besides being expensive and time-consuming, LLPS is heavily influenced by multiple environmental conditions, such as concentration, pH and temperature, so each protein requires a combinatorial number of experiments.

RESULTS To overcome this problem, we propose Droppler, a neural network model able to predict the LLPS behavior of proteins under specified experimental conditions, effectively predicting the outcome of in vitro experiments. Our model can be used to rapidly screen proteins and experimental conditions for LLPS, thus reducing the search space that needs to be covered experimentally. We experimentally validate Droppler's predictions on the TAR DNA-binding protein under different experimental conditions, showing the consistency of its predictions.

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
