Discovering putative prion sequences in complete proteomes using probabilistic representations of Q/N-rich domains

Background: Prion proteins form a special class among amyloids due to their ability to transmit aggregative folds. Prions are known to act as infectious agents in neurodegenerative diseases in animals, or as key elements in transcription and translation processes in yeast. It has been suggested that prions contain specific sequential domains with distinctive amino acid composition and physicochemical properties that allow them to control the switch between soluble and β-sheet aggregated states. These prion-forming domains are low-complexity segments enriched in glutamine/asparagine and depleted in charged residues and prolines. Different predictive methods have been developed to discover novel prions by either assessing the compositional bias of these stretches or estimating the propensity of protein sequences to form amyloid aggregates. However, the available algorithms have so far lacked a thorough statistical calibration against large sequence databases, so they cannot predict prions accurately without also retrieving a large number of false positives.

Results: Here we present a computational strategy to predict putative prion-forming proteins in complete proteomes using probabilistic representations of prionogenic glutamine/asparagine-rich regions. After benchmarking our predictive model against large sets of non-prionic sequences, we were able to single out known prions with high precision and accuracy, generating prediction sets with few false positives. The algorithm was used to scan all the proteomes annotated in public databases for the presence of putative prion proteins. We analyzed the presence of putative prion proteins in all taxa, from viruses and archaea to plants and higher eukaryotes, and found that most organisms encode evolutionarily unrelated proteins with susceptibility to behave as prions.

Conclusions: To our knowledge, this is the first wide-ranging study aiming to predict prion domains in complete proteomes. Approaches of this kind could be of great importance for identifying potential targets for further experimental testing and for reaching a deeper understanding of prions' functional and regulatory mechanisms.
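As a rough illustration of the compositional bias described in the Background, the sketch below slides a fixed-length window along a sequence and flags windows that are enriched in glutamine/asparagine and depleted in charged residues and prolines. It is not the probabilistic model presented in this work: the function name `qn_rich_windows`, the window length, and both thresholds are assumptions chosen only for illustration.

```python
from typing import List, Tuple


def qn_rich_windows(
    seq: str,
    window: int = 60,              # assumed window length, not from the paper
    min_qn: float = 0.30,          # assumed minimum Q/N fraction
    max_charged_pro: float = 0.10  # assumed maximum D/E/K/R/P fraction
) -> List[Tuple[int, float]]:
    """Return (start, qn_fraction) for every window passing both thresholds."""
    seq = seq.upper()
    hits = []
    # If the sequence is shorter than the window, the range is empty.
    for start in range(0, len(seq) - window + 1):
        w = seq[start:start + window]
        # Fraction of glutamine (Q) and asparagine (N) in the window.
        qn = sum(w.count(aa) for aa in "QN") / window
        # Fraction of charged residues (D, E, K, R) and proline (P).
        cp = sum(w.count(aa) for aa in "DEKRP") / window
        if qn >= min_qn and cp <= max_charged_pro:
            hits.append((start, qn))
    return hits
```

A purely compositional filter like this is the kind of heuristic that tends to return many false positives, which is why a scan over complete proteomes would need scores calibrated against large sets of non-prionic sequences.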
