H-InvDB in 2009: extended database and data mining resources for human genes and transcripts

We report the extended database and data mining resources newly released in the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). H-InvDB is a comprehensive annotation resource for human genes and transcripts, and consists of two main views and six sub-databases. The latest release of H-InvDB (release 6.2) provides annotation for 219 765 human transcripts in 43 159 human gene clusters, based on human full-length cDNAs and mRNAs. H-InvDB now provides several new annotation features, such as mapping of microarray probes, new gene models, relations to known ncRNAs and information from the Glycogene database. H-InvDB also provides useful data mining resources: ‘Navigation search’, the ‘H-InvDB Enrichment Analysis Tool (HEAT)’ and web service APIs. ‘Navigation search’ is an extended search system that enables complex searches by combining 16 different search options. HEAT is a data mining tool that automatically identifies features specific to a given human gene set: it searches for H-InvDB annotations that are significantly enriched in a user-defined gene set compared with the full set of H-InvDB representative transcripts. H-InvDB now also offers SOAP and REST web service APIs, which allow programs to use H-InvDB data and give users broader access to the data.
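The enrichment idea behind HEAT can be illustrated with a small sketch. The abstract does not say which statistical test HEAT applies, so the one-sided hypergeometric test below is an assumption chosen because it is a common way to score over-representation of an annotation in a gene set against a background. The function name, its parameters and the toy numbers are all hypothetical and are not taken from H-InvDB.

```python
from math import comb


def hypergeom_enrichment_p(total, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    total     -- size of the background set (e.g. all representative transcripts)
    annotated -- background items that carry the annotation of interest
    sample    -- size of the user-defined gene set
    hits      -- items in the gene set that carry the annotation
    """
    upper = min(annotated, sample)
    # Count the ways to draw k annotated and (sample - k) unannotated items,
    # summed over every k at least as extreme as the observed hit count.
    tail = sum(
        comb(annotated, k) * comb(total - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(total, sample)


# Hypothetical toy numbers (not H-InvDB data): a background of 20 items,
# 5 of them annotated; a gene set of 4 items, 2 of which are annotated.
p_value = hypergeom_enrichment_p(total=20, annotated=5, sample=4, hits=2)
print(f"p = {p_value:.4f}")
```

A small p-value would suggest that the annotation occurs in the gene set more often than expected by chance. A real analysis over many annotations would also need a multiple-testing correction, which this sketch omits.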

Katsuhiko Murakami | Yoshiharu Sato | Takuya Habara | Akihiro Matsuya | Hajime Nakaoka | Jun-ichi Takeda | Fusano Todokoro | Chisato Yamasaki | Tadashi Imanishi | Takashi Gojobori | Akiko Ogura Noda | Ryuichi Sakate
