Inferring alternative splicing patterns in mouse from a full-length cDNA library and microarray data.

Although many studies of alternative splicing in specific genes have been reported, the general mechanism that regulates alternative splicing is not yet clearly understood. In this study, we systematically aligned every pair of 21,076 Mus musculus cDNA sequences, searched for putative alternative splicing patterns, and compiled a list of potential alternative splicing sites. Two cDNAs are suspected to be alternatively spliced transcripts of a common gene if they show a high degree of sequence similarity over most of their length but one or more regions differ markedly between them or are absent from one of them. For each candidate, the list records the following information: (1) tissue, (2) developmental stage, (3) sequences around the splice sites, (4) the length of each gapped region, and (5) other comments. The list is available at http://www.bioinfo.sfc.keio.ac.jp/intron. Our results predict a number of previously unreported alternatively spliced genes, some of which are expressed only in a specific tissue or at a specific developmental stage.
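The pairwise criterion described above can be illustrated with a minimal sketch. It is not the alignment procedure used in the study, which the abstract does not specify: Python's standard-library `difflib.SequenceMatcher` stands in for a real alignment tool, and the function name `candidate_splice_gaps`, the thresholds `min_shared` and `min_gap`, the `flank` width, and the toy exon sequences are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def candidate_splice_gaps(cdna_a, cdna_b, min_shared=0.7, min_gap=10, flank=5):
    """Report gapped regions between two cDNAs that share most of their sequence.

    Assumptions (not from the paper): exact matching blocks from difflib
    approximate an alignment; min_shared, min_gap and flank are arbitrary.
    """
    if not cdna_a or not cdna_b:
        return []

    matcher = SequenceMatcher(None, cdna_a, cdna_b, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel; drop it.
    blocks = [b for b in matcher.get_matching_blocks() if b.size > 0]

    # Require that most of the shorter cDNA is covered by shared blocks.
    shared = sum(b.size for b in blocks)
    if shared / min(len(cdna_a), len(cdna_b)) < min_shared:
        return []

    gaps = []
    for left, right in zip(blocks, blocks[1:]):
        a_start, a_end = left.a + left.size, right.a
        b_start, b_end = left.b + left.size, right.b
        gap_in_a, gap_in_b = a_end - a_start, b_end - b_start
        # A long unmatched stretch flanked by shared sequence is a candidate:
        # present in one cDNA only (deletion) or different in both (distinct).
        if max(gap_in_a, gap_in_b) >= min_gap:
            gaps.append({
                "a_start": a_start, "a_end": a_end,
                "b_start": b_start, "b_end": b_end,
                "gap_in_a": gap_in_a, "gap_in_b": gap_in_b,
                # Sequence around the putative splice site, taken from cdna_a.
                "left_flank": cdna_a[max(0, a_start - flank):a_start],
                "right_flank": cdna_a[a_end:a_end + flank],
            })
    return gaps


if __name__ == "__main__":
    # Toy sequences: the second transcript skips the middle "exon".
    exon1 = "ATGGCGTACGTTAGCCTAGGCTAACGT"
    exon2 = "CCCCGGGGTTTTAAAACCCC"
    exon3 = "GATTACAGATTACATCGATCGGATCC"
    for gap in candidate_splice_gaps(exon1 + exon2 + exon3, exon1 + exon3):
        print(gap)
```

Because `SequenceMatcher` reports only exact matching blocks, a single sequencing error splits a block in two; the `min_gap` threshold is what keeps such small interruptions from being reported as candidate splice events. A real pipeline would use a gapped alignment tool instead.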
