Database resources of the National Center for Biotechnology Information: update

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov.
