H-DBAS: Alternative splicing database of completely sequenced and manually annotated full-length cDNAs based on H-Invitational

The Human-transcriptome DataBase for Alternative Splicing (H-DBAS) is a specialized database of alternatively spliced human transcripts. In this database, each alternative splicing (AS) variant corresponds to a completely sequenced and carefully annotated human full-length cDNA, one of those collected for the H-Invitational human-transcriptome annotation meeting. In total, H-DBAS contains 38 664 representative alternative splicing variants (RASVs) in 11 744 loci. The data can be retrieved by various features of AS, which were assigned on the basis of manual annotation: AS patterns, the resulting alterations in the encoded amino acids and the protein motifs they affect, GO terms, predicted subcellular localization signals and transmembrane domains. The database also records recently identified, very complex patterns of AS in which two distinct genes appear to be bridged, nested or degenerated (multiple CDS); in all three cases, completely unrelated proteins are encoded by a single locus. With AS Viewer, each AS event can be analyzed in the context of full-length cDNAs, so that the user can see directly how an AS event relates to the consequent alterations in the encoded amino acid sequence and to the various kinds of affected protein motifs. H-DBAS is accessible at .
