Viral Proteins Acquired from a Host Converge to Simplified Domain Architectures

The infection cycle of viruses creates many opportunities for the exchange of genetic material with the host. Many viruses integrate their sequences into the genome of their host for replication. These processes may lead to the acquisition of host sequences by the virus. Such sequences are prone to the accumulation of mutations and deletions. However, in rare instances, sequences acquired from a host become beneficial for the virus. We searched for unexpected sequence similarity between the 900,000 viral proteins and all proteins from cellular organisms. Here, we focus on viruses that infect metazoa. The high-conservation analysis yielded 187 instances of highly similar viral-host sequences. Only a small number of them represent viruses that hijacked host sequences. The low-conservation sequence analysis uses the Pfam family collection. About 5% of the 12,000 statistical models archived in Pfam are composed of viral-metazoan proteins. For about half of these Pfam families, we provide indirect support for the directionality from the host to the virus. The other families are either wrongly annotated or reflect an extensive sequence exchange between the viruses and their hosts. In about 75% of cross-taxa Pfam families, the viral proteins are significantly shorter than their metazoan counterparts. The tendency for viral proteins to be shorter than their related host proteins is explained by the acquisition of only a fragment of the host gene, the elimination of an internal domain, and the shortening of the linkers between domains. We conclude that, in the course of viral evolution, the host-originated sequences converge to simplified domain compositions. We postulate that the trimmed proteins act by interfering with fundamental functions of the host, including intracellular signaling, post-translational modification, protein-protein interaction networks and cellular trafficking. We compiled a collection of hijacked protein sequences. These sequences are attractive targets for manipulation of viral infection.
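
The per-family length comparison described above can be illustrated with a minimal sketch. The abstract does not name the statistical test that was used, so the one-sided Mann-Whitney U test below (normal approximation, no tie correction) is an assumption, as are the function names, the 0.05 threshold, and the two toy families with made-up protein lengths.

```python
import math


def mann_whitney_less(viral, host):
    """One-sided p-value that `viral` lengths tend to be smaller than `host`.

    Assumption: normal approximation to the Mann-Whitney U statistic,
    without tie correction. This is an illustrative choice, not
    necessarily the test used in the study.
    """
    n1, n2 = len(viral), len(host)
    # U counts (viral, host) pairs where the viral protein is shorter;
    # ties contribute one half.
    u = sum(1.0 if a < b else 0.5 if a == b else 0.0
            for a in viral for b in host)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def shorter_viral_families(families, alpha=0.05):
    """Names of families whose viral proteins are significantly shorter."""
    return sorted(name for name, (viral, host) in families.items()
                  if mann_whitney_less(viral, host) < alpha)


# Hypothetical data: family name -> (viral lengths, metazoan lengths),
# in amino acids. These values are invented for illustration only.
families = {
    "PF_A": ([120, 135, 140, 150, 128], [300, 320, 410, 290, 350, 330]),
    "PF_B": ([250, 260, 240, 255], [245, 262, 238, 251, 258]),
}

significant = shorter_viral_families(families)
fraction = len(significant) / len(families)
print(significant, fraction)
```

On real data, each family would hold the lengths of all viral and metazoan member proteins, and `fraction` would be the quantity reported as a percentage of cross-taxa families.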
