The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags

Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define ≈23,500 genes, of which ≈1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, which may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although the EST data have been widely exploited by the scientific community, vindicating our totally open source policy, they still provide extensive information that remains to be systematically explored and that may further facilitate progress toward both the understanding and treatment of human cancers.

Marcio Luis Acencio | C. V. Jongeneel | John Quackenbush | Janet Kelso | Christian Iseli | Sergio Verjovski-Almeida | Brian J. Stevenson | Helena Brentani | Christine Hackel | Paulo Lee Ho | Winston Hide | Alan Mackay | Mari Cleide Sogayar | Luiz Paulo Kowalski | Milton Faria | Katlin Brauer Massirer | Maria Aparecida Nagai | Arthur Gruber | Robert L Strausberg | Helaine Carrer | Marco Grivet | Edna Teruko Kimura | Xin Lu | Fernando Costa | Daniel Giannella-Neto | Maria de Fátima Sonati | Liliane A. T. Arnaldi | Mário Henrique Bengtson | Carlos Alberto Mestriner | Valéria Valente | Heloisa Zalcberg | Dirce Maria Carraro | Paula Rahal | Ademar Lopes | Fernando Augusto Soares | Fabio Passetti | Fabiana Bettoni | Eloiza H Tajara | Anamaria A Camargo | R. Strausberg | J. Kelso | A. Camargo | G. Riggins | K. Massirer | W. Bodmer | Winston A Hide | M. Acencio | A. Gruber | N. Pereira da Silva | W. Cavenee | F. Furnari | J. Krieger | E. Tajara | M. Estecio | M. Sogayar | C. Mestriner | C. Iseli | A. Simpson | S. D. de Souza | H. Carrer | M. Nagai | M. Zago | Mário Mourão Neto | E. Miracca | L. Kowalski | S. Verjovski-Almeida | A. Mackay | A. Neville | M. O'hare | O. Caballero | G. Goldman | H. Brentani | F. Costa | A. Paquola | D. Carraro | A. Lopes | S. Rogatto | F. Passetti | F. Soares | A. J. Holanda | A. D. da Silva | E. Kimura | S. Valentini | J. R. Pandolfi | M. Bengtson | V. Valente | M. L. Paço-Larson | E. M. Espreafico | M. H. Goldman | E. A. Martins | P. E. Guimarães | E. Ojopi | P. Ho | A. L. Nascimento | E. M. Reis | M. Briones | L. C. Leite | F. Nóbrega | J. Pesquero | A. Vettore | D. Giannella-Neto | M. Sonati | H. El-Dorry | D. Bicknell | A. Carvalho | Paulo Sergio Lopes Oliveira | C. Romano | P. Rahal | A Munro Neville | Aline M da Silva | J. E. Souza | Francisco G Nobrega | Claudia Aparecida Rainho | Silvia Regina Rogatto | Frank B Furnari | I. T. da Silva | Luiz Paulo Camargo | Marcelo R S Briones | Janete M Cerutti | Marina P Nobrega | Vanderlei Rodrigues | Janaína G Romeiro | Sandro R Valentini | A. Montagnini | Camila Malta Romano | Walter F Bodmer | Webster Cavenee | J. Cerutti | R. M. Maciel | Brian J Stevenson | C Victor Jongeneel | Gregory J Riggins | F. Bettoni | Elisson C Osorio | David C Bicknell | S. A. de Bessa | Marco Antonio Zago | L. Camargo | Rui M B Maciel | Alex F Carvalho | Christian Colin | Waleska K Martins | Gustavo Henrique Goldman | André Luiz Vettore | Sandro de Souza | Benedito Mauro Rossi | Otávia L Caballero | Wilson Araújo da Silva | Emmanuel Dias Neto | Pedro Edson Moreira Guimaraes | Elida Paula Benquique Ojopi | Eduardo M R Reis | Andrew John George Simpson | Luis Eduardo Coelho Andrade | Paulo César Costa dos Santos | Maria Cristina Ramos Costa | Israel Tojal da Silva | Marcos Roberto H Estécio | Karine Sa Ferreira | Pedro A F Galante | Gustavo S Guimaraes | Adriano Jesus Holanda | Maarten R Leerkes | Elizabeth A L Martins | Analy S A Melo | Elisabete Cristina Miracca | Leandro Lorenco Miranda | Paulo S Oliveira | Apua C M Paquola | José Rodrigo C Pandolfi | Maria Ines de Moura Campos Pardini | Beatriz Schnabel | Jorge E Souza | Andre C Zaiats | Elisabete Jorge Amaral | Liliane A T Arnaldi | Amelia Goes de Araújo | Simone Aparecida de Bessa | Maria Eugenia Ribeiro de Camaro | Cyntia Curcio | Ismael Dale Cotrim Guerreiro da Silva | Neusa Pereira da Silva | Márcia Dellamano | Hamza El-Dorry | Enilza Maria Espreafico | Ari José Scattone Ferreira | Cristiane Ayres Ferreira | Maria Angela H Z Fortes | Angelita Habr Gama | Maria Lúcia C C Giannella | Ricardo R Giorgi | Maria Helena S Goldman | Elza Myiuki Kimura | Jose E Krieger | Luciana C C Leite | Ana Mercedes S C Luna | Suely Kazue Nagahashi Mari | Adriana Aparecida Marques | André Montagnini | Mario Mourão Neto | Ana Lucia T O Nascimento | Mike J O'Hare | Audrey Yumi Otsuka | Anna Izabel Ruas de Melo | Maria Luisa Paco-Larson | Gonçalo Guimarães Pereira | Joao Bosco Pesquero | Juliana Gilbert Pessoa | Janaina Gusmao Romeiro | Monica Rusticci | Renata Guerra de Sá | Simone Cristina Sant' Anna | Miriam L Sarmazo | Teresa Cristina de Lima E Silva | Josane de Freitas Sousa | Diana Queiroz | Fabiola Elizabeth Villanova | C. Hackel | W. D. da Silva | C. Rainho | V. Rodrigues | C. Curcio | A. S. Melo | T. C. Silva | B. Schnabel | E. Osório | R. Giorgi | M. Grivet | M. Dellamano | Waleska K. Martins | B. Rossi | M. Fortes | I. D. C. Guerreiro da Silva | F. Villanova | M. Leerkes | G. Guimarães | J. de Freitas Sousa | E. Dias Neto | Elisabete Amaral | L. Arnaldi | P. Galante | A. H. Gama | Karine Sá Ferreira | A. C. Zaiats | A. Y. Otsuka | C. Colín | G. G. Guimarães Pereira | Luís Eduardo Coelho Andrade | Paulo Cesar Costa dos Santos | A. G. De Araujo | M. L. Giannella | Ana M Luna | A. Marques | M. Nóbrega | J. G. Pessoa | M. Rusticci | D. Queiroz | H. Zalcberg | R. Maciel | M. Giannella | G. S. Guimarães | I. G. D. Guerreiro da Silva | M. Goldman | C. Jongeneel | L. P. Kowalski | Maria Inês de Moura Campos Pardini | W. A. da Silva
