Identification and investigation of ORFans in the viral world

Background: Genome-wide studies have already shed light on the evolution and enormous diversity of the viral world. Nevertheless, one of the unresolved mysteries in comparative genomics today is the abundance of ORFans, that is, ORFs with no detectable sequence similarity to any other ORF in the databases. Recently, studies attempting to understand the origin and functions of bacterial ORFans have been reported. Here we present a first genome-wide identification and analysis of ORFans in the viral world, with a focus on bacteriophages.

Results: Almost one-third of all ORFs in 1,456 complete virus genomes correspond to ORFans, a figure significantly larger than that observed in prokaryotes. Like prokaryotic ORFans, viral ORFans are shorter and have a lower GC content than non-ORFans. However, the difference in GC content is statistically significant in only a minority of viruses. Focusing on phages, we find that 38.4% of phage ORFs have no homologs in other phages, and 30.1% have no homologs in either the viral or the prokaryotic world. Phages with different host ranges have different percentages of ORFans, reflecting differences in sampling status and suggesting differing levels of diversity. Similarity searches of the phage ORFeome (ORFans and non-ORFans) against prokaryotic genomes show that almost half of the phage ORFs have prokaryotic homologs, suggesting the major role that horizontal transfer plays in bacterial evolution. Surprisingly, the percentage of phage ORFans with prokaryotic homologs is only 18.7%. This suggests that phage ORFans play a lesser role in horizontal transfer to prokaryotes, but may be among the major players contributing to the vast phage diversity.

Conclusion: The current sampling of viral genomes is extremely low, and ORFans and near-ORFans are likely to continue to grow in number as more genomes are sequenced. The abundance of phage ORFans may be partially due to the expected vast viral diversity, and may be instrumental in understanding viral evolution. The functions, origins and fates of the majority of viral ORFans remain a mystery. Further computational and experimental studies are likely to shed light on the mechanisms that have given rise to so many bacterial and viral ORFans.
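
The abstract does not spell out the search pipeline, so the following is only a minimal sketch of the general idea: label an ORF as an ORFan when it has no sufficiently strong similarity hit to an ORF in any other genome, then compare GC content between the two groups. The `Orf` record, the `find_orfans` and `summarize` helpers, the hit-tuple layout and the E-value cutoff of 1e-3 are all illustrative assumptions, not the authors' actual method or parameters.

```python
from dataclasses import dataclass


@dataclass
class Orf:
    """Hypothetical record for one ORF (names chosen for this sketch)."""
    orf_id: str
    genome: str
    seq: str  # nucleotide sequence


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_orfans(orfs, hits, max_evalue=1e-3):
    """Return the ids of ORFs with no significant hit in another genome.

    hits: iterable of (query_id, subject_id, evalue) tuples, e.g. parsed
    from a similarity-search report. The cutoff is an assumed value.
    """
    genome_of = {o.orf_id: o.genome for o in orfs}
    has_homolog = set()
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # too weak to count as detectable similarity
        if genome_of.get(query) == genome_of.get(subject):
            continue  # self hits and within-genome paralogs do not count
        has_homolog.update((query, subject))
    return {o.orf_id for o in orfs if o.orf_id not in has_homolog}


def summarize(orfs, orfan_ids):
    """ORFan fraction and mean GC content of ORFans versus non-ORFans."""
    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    orfan_gc = [gc_content(o.seq) for o in orfs if o.orf_id in orfan_ids]
    other_gc = [gc_content(o.seq) for o in orfs if o.orf_id not in orfan_ids]
    return {
        "orfan_fraction": len(orfan_gc) / len(orfs) if orfs else float("nan"),
        "mean_gc_orfans": mean(orfan_gc),
        "mean_gc_non_orfans": mean(other_gc),
    }


if __name__ == "__main__":
    # Toy data only; these sequences and hits are invented for illustration.
    toy_orfs = [
        Orf("a1", "phageA", "ATGGCGCGCTAA"),
        Orf("a2", "phageA", "ATGAAATAA"),
        Orf("b1", "phageB", "ATGGCGCGTTAA"),
    ]
    toy_hits = [("a1", "b1", 1e-20), ("a2", "a1", 1e-10), ("a2", "b1", 5.0)]
    print(summarize(toy_orfs, find_orfans(toy_orfs, toy_hits)))
```

Changing the set of genomes searched against (other phages only, all viruses, or viruses plus prokaryotes) changes which ORFs count as ORFans, which is why the abstract reports separate percentages for each comparison.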
