Gene discovery using computational and microarray analysis of transcription in the Drosophila melanogaster testis.

Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes, so the testis is potentially of high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). The testis ESTs collapsed computationally into a set of 1560 cDNAs, which was used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs, and 16% fail to align with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p < 0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be an important tool for refining the annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601.
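To give a feel for the scale behind the percentages quoted above, the following minimal sketch converts them into approximate cDNA counts. It assumes that every percentage refers to the 1560-cDNA set, which the abstract implies but does not state for each figure. Because the percentages are rounded, the counts are estimates only, and the names used (`approx_counts`, `FRACTIONS`) are illustrative rather than taken from the study.

```python
# Figures quoted in the abstract.
TOTAL_ESTS = 3141   # high-quality testis ESTs
CDNA_SET = 1560     # cDNAs remaining after computational collapse

# Rounded percentages from the abstract, ASSUMED to be fractions of the 1560-cDNA set.
FRACTIONS = {
    "named genes": 0.11,
    "support a predicted gene": 0.33,
    "no existing EST match": 0.47,
    "no predicted-gene match": 0.16,
}


def approx_counts(total, fractions):
    """Convert fractions of a set into rounded, approximate counts."""
    return {label: round(total * frac) for label, frac in fractions.items()}


counts = approx_counts(CDNA_SET, FRACTIONS)
ests_per_cdna = TOTAL_ESTS / CDNA_SET  # average redundancy of the EST collection

for label, n in counts.items():
    print(f"{label}: ~{n} cDNAs")
print(f"average ESTs per cDNA: {ests_per_cdna:.2f}")
```

On these assumptions, roughly two ESTs were sequenced per distinct cDNA on average, which is consistent with the abstract's description of a highly complex testis mRNA population.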
