Database resources of the National Center for Biotechnology Information

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. The Entrez system provides search and retrieval operations for most of these data across 37 distinct databases, and the E-utilities serve as the programming interface for Entrez. Many of the Web applications are augmented by custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include iCn3D, MutaBind and the Antimicrobial Resistance Gene Reference Database. Resources updated in the past year include My Bibliography, SciENcv, the Pathogen Detection Project, Assembly, Genome, the Genome Data Viewer, BLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
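As an illustration of the E-utilities serving as the programming interface for Entrez, the following is a minimal sketch of an ESearch request that returns record identifiers from one Entrez database. The helper names `build_esearch_url` and `search_ids`, the example query, and the `retmax` default are assumptions made for this sketch and do not come from the article; only the public ESearch endpoint and its `db`, `term`, `retmax`, and `retmode` parameters are relied on.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=5):
    """Build an ESearch URL (hypothetical helper, for illustration only)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def search_ids(db, term, retmax=5):
    """Run the search over the network and return the matching record IDs."""
    with urlopen(build_esearch_url(db, term, retmax)) as resp:
        payload = json.load(resp)
    return payload["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Only prints the request URL; call search_ids() to query NCBI itself.
    print(build_esearch_url("pubmed", "antimicrobial resistance"))
```

Building the URL separately from sending the request keeps the query easy to inspect and test without network access. The returned identifiers could then be passed to other E-utilities, such as ESummary or EFetch, to retrieve the records themselves.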
