Reference-based annotation with GeneMapper

We introduce GeneMapper, a program for transferring annotations from a well-annotated genome to other genomes. Drawing on high-quality curated annotations, GeneMapper enables rapid and accurate annotation of newly sequenced genomes and is suitable for both finished and draft genomes. GeneMapper uses a profile-based approach for mapping genes into multiple species, improving upon the standard pairwise approach. GeneMapper is freely available for academic use.
