Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation

The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different approaches to the problem of identification. Consequently, it is difficult to maintain a consistent collection of pseudogenes in the detail necessary for their effective use. Our database is designed to address this issue. It integrates a variety of heterogeneous resources and supports a subset structure that highlights specific groups of pseudogenes of interest to the research community. Tools are provided for comparing sets and creating layered set unions, enabling researchers to derive a current ‘consensus’ set of pseudogenes. Additional features include versatile search, the capacity for robust interaction with other databases, the ability to reconstruct older versions of the database (accounting for changing genome builds) and an underlying object-oriented interface designed for researchers with minimal knowledge of programming. The database currently contains more than 100 000 pseudogenes spanning 64 prokaryote and 11 eukaryote genomes, including a collection of human annotations compiled from 16 sources.
