Navigating the human transcriptome

The potential coding capacity of the human genome is currently a topic of great interest. The number of genes predicted from the recent human-genome analysis was at the lower end of previous estimates, which had ranged from about 30,000 to 120,000 (1, 2). Although estimates of gene number are likely to rise as experimental evidence accumulates and gene-finding algorithms improve, it is clear that gene number is only one mechanism for creating the genetic diversity required to encode the full complement of human proteins. The scientific literature richly documents the presence and functional significance of alternatively processed forms of human transcripts, which arise from different transcription initiation sites, alternative exon splicing, and multiple polyadenylation sites (3–5). Determining the various transcript forms and investigating the purpose of these complex mixtures of instructions will be the next great endeavor toward understanding human biology.

[1]  J. Craig Venter, et al.  Sequence identification of 2,375 human brain genes, 1992, Nature.

[2]  Williamson.  The Merck Gene Index project, 1999, Drug discovery today.

[3]  Emmanuel Dias-Neto, et al.  The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[4]  S. Altschul, et al.  A public database for gene expression in human cancers, 1999, Cancer research.

[5]  G. Edwalds-Gilbert, et al.  Alternative poly(A) site selection in complex transcription units: means to an end?, 1997, Nucleic acids research.

[6]  C. Auffray, et al.  The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression, 1996, Genomics.

[7]  Timothy B. Stockwell, et al.  The Sequence of the Human Genome, 2001, Science.

[8]  C. Bult, et al.  Functional annotation of a full-length mouse cDNA collection, 2001, Nature.

[9]  Ash A. Alizadeh, et al.  Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling, 2000, Nature.

[10]  Matthew W. Pennington, et al.  Thirteen UDPglucuronosyltransferase genes are encoded at the human UGT1 gene complex locus, 2001, Pharmacogenetics.

[11]  Donna R. Maglott, et al.  RefSeq and LocusLink: NCBI gene-centered resources, 2001, Nucleic Acids Res.

[12]  R. Strausberg, et al.  The cancer genome anatomy project: building an annotated gene index, 2000, Trends in genetics : TIG.

[13]  Bernhard Korn, et al.  Toward a Catalog of Human Genes and Proteins: Sequencing and Analysis of 500 Novel Complete Protein Coding Human cDNAs, 2001.

[14]  K. Kinzler, et al.  Serial Analysis of Gene Expression, 1995, Science.

[15]  M. Boguski, et al.  dbEST — database for “expressed sequence tags”, 1993, Nature Genetics.

[16]  J. Shay, et al.  An alternate splicing variant of the human telomerase catalytic subunit inhibits telomerase activity, 2000, Neoplasia.

[17]  M. Bittner, et al.  Human prostate cancer and benign prostatic hyperplasia: molecular dissection by gene expression profiling, 2001, Cancer research.

[18]  R D Klausner, et al.  The mammalian gene collection, 1999, Science.

[19]  J. V. Moran, et al.  Initial sequencing and analysis of the human genome, 2001, Nature.