Genome cartography through domain annotation

The evolutionary history of eukaryotic proteins involves rapid sequence divergence, addition and deletion of domains, and fusion and fission of genes. Although the protein repertoires of distantly related species differ greatly, their domain repertoires do not. To account for the great diversity of domain contexts and an unexpected paucity of ortholog conservation, we must categorize the coding regions of completely sequenced genomes into domain families, as well as protein families.
