The ORFanage: an ORFan database

Each newly sequenced genome contains a significant number of protein-coding ORFs that are species-, family- or lineage-specific, which raises many questions about the evolution and role of these ORFs and of the genomes that contain them. We refer to these poorly conserved ORFs as singleton or paralogous ORFans if they are unique to one genome, or as orthologous ORFans if they appear only in a family of closely related organisms and have no homolog in other genomes. To study and classify ORFans, we have constructed the ORFanage, an ORFan database. The database consists of the predicted ORFs in fully sequenced microbial genomes and supports searching for the three types of ORFans in any user-chosen subset of the genomes. The ORFanage could help in choosing interesting targets for further genomic and evolutionary studies. It is accessible at http://www.bioinformatics.buffalo.edu/ORFanage.
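The three ORFan types can be illustrated with a minimal sketch. It is not the ORFanage implementation: the function name `classify_orfan`, the `family_of` mapping, and the genome labels are hypothetical, and the homology hits are assumed to come from a separate sequence-similarity search. The sketch also assumes that a singleton ORFan has no homolog at all, while a paralogous ORFan has homologs only within its own genome.

```python
from typing import Dict, Iterable, Optional, Set


def classify_orfan(
    query_genome: str,
    hit_genomes: Iterable[str],
    family_of: Dict[str, str],
    selected: Set[str],
) -> Optional[str]:
    """Classify one ORF from the genomes in which its homologs were found.

    hit_genomes: genome of each homologous hit, self-hit excluded
                 (assumed output of a prior similarity search).
    family_of:   hypothetical mapping from genome to its family of
                 closely related organisms.
    selected:    the subset of genomes the user chose to compare against.
    Returns 'singleton', 'paralogous', 'orthologous', or None if the ORF
    has a homolog outside its family and so is not an ORFan.
    """
    # Only hits in the chosen subset (or in the query genome itself) count.
    hits = [g for g in hit_genomes if g in selected or g == query_genome]
    if not hits:
        return "singleton"  # no homolog anywhere in the subset
    other_genomes = {g for g in hits if g != query_genome}
    if not other_genomes:
        return "paralogous"  # homologs only within the same genome
    query_family = family_of[query_genome]
    if all(family_of[g] == query_family for g in other_genomes):
        return "orthologous"  # homologs only in closely related genomes
    return None


# Illustrative labels only; they are not taken from the database.
family_of = {"Ecoli_K12": "Escherichia", "Ecoli_O157": "Escherichia", "Bsub_168": "Bacillus"}
all_genomes = set(family_of)
print(classify_orfan("Ecoli_K12", ["Ecoli_O157"], family_of, all_genomes))
```

Because the classification depends on `selected`, the same ORF can be an ORFan in one subset of genomes and not in another, which is why the choice of subset matters.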
