DDEC: Dragon database of genes implicated in esophageal cancer

Background

Esophageal cancer (EC) is the eighth most commonly occurring cancer. Its lethality stems primarily from the inability to detect the disease during the early, organ-confined stage and from the lack of effective therapies for advanced-stage disease. Moreover, the molecular processes involved in esophageal cancer are not fully understood, which hampers the development of efficient diagnostics and therapy. Efforts by the scientific community to improve the survival rate of esophageal cancer have produced a wealth of scattered information that is difficult to find and not easily amenable to data-mining. To close this gap and to complement available cancer-related bioinformatic resources, we have developed a comprehensive database of esophageal cancer-related information, the Dragon Database of Genes Implicated in Esophageal Cancer (DDEC), as an integrated knowledge database intended to serve as a gateway to esophageal cancer-related data.

Description

The database contains 529 manually curated genes that are differentially expressed in EC. We extracted and analyzed the promoter regions of these genes and complemented the gene-related information with the transcription factors that potentially control them. We further precompiled text-mined and data-mined reports about each of these genes to allow easy exploration of associations of EC-implicated genes with other human genes and proteins, metabolites and enzymes, toxins, chemicals with pharmacological effects, disease concepts and human anatomy. The resulting database, DDEC, can display potential associations that are rarely reported and thus difficult to identify. Moreover, DDEC enables inspection of potentially new 'association hypotheses' generated from the precompiled reports.

Conclusion

We hope that this resource will serve as a useful complement to the existing public resources and as a good starting point for researchers and physicians interested in EC genetics.
DDEC is freely accessible to academic and non-profit users at http://apps.sanbi.ac.za/ddec/. DDEC will be updated twice a year.

[1]  Gopal R. Gopinath,et al.  Correction: Reactome: a knowledge base of biologic pathways and processes , 2009, Genome Biology.

[2]  Vladimir B. Bajic,et al.  Database for exploration of functional context of genes implicated in ovarian cancer , 2008, Nucleic Acids Res..

[3]  Derek E. Wildman,et al.  New Onto-Tools: Promoter-Express, nsSNPCounter and Onto-Translate , 2006, Nucleic Acids Res..

[4]  A Bairoch,et al.  SWISS-PROT: connecting biomolecular knowledge via a protein database. , 2001, Current issues in molecular biology.

[5]  T. Speed,et al.  Statistical issues in cDNA microarray data analysis. , 2003, Methods in molecular biology.

[6]  Vladimir B. Bajic,et al.  Dragon TF Association Miner: a system for exploring transcription factor associations through text-mining , 2004, Nucleic Acids Res..

[7]  Shinzaburo Noguchi,et al.  Cancer gene expression database (CGED): a database for gene expression profiling with accompanying clinical information of human cancer tissues , 2004, Nucleic Acids Res..

[8]  Haruki Nakamura,et al.  The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data , 2006, Nucleic Acids Res..

[9]  Bart De Moor,et al.  TOUCAN 2: the all-inclusive open source workbench for regulatory sequence analysis , 2005, Nucleic Acids Res..

[10]  T. Cremer,et al.  Chromosome territories, nuclear architecture and gene regulation in mammalian cells , 2001, Nature Reviews Genetics.

[11]  Xin Chen,et al.  The TRANSFAC system on gene expression regulation , 2001, Nucleic Acids Res..

[12]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[13]  Alexander E. Kel,et al.  TRANSFAC® and its module TRANSCompel®: transcriptional gene regulation in eukaryotes , 2005, Nucleic Acids Res..

[14]  V. Bajic,et al.  Dragon Plant Biology Explorer. A Text-Mining Tool for Integrating Associations between Genetic and Biochemical Entities with Genome Annotation and Biochemical Terms Lists[w] , 2005, Plant Physiology.

[15]  Brad T. Sherman,et al.  DAVID: Database for Annotation, Visualization, and Integrated Discovery , 2003, Genome Biology.

[16]  Alan Christoffels,et al.  DDESC: Dragon database for exploration of sodium channels in human , 2008, BMC Genomics.

[17]  Alexander E. Kel,et al.  MATCHTM: a tool for searching transcription factor binding sites in DNA sequences , 2003, Nucleic Acids Res..

[18]  E. Wingender,et al.  MATCH: A tool for searching transcription factor binding sites in DNA sequences. , 2003, Nucleic acids research.

[19]  Tatiana Tatusova,et al.  NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins , 2004, Nucleic Acids Res..

[20]  Andreas Prlic,et al.  Ensembl 2008 , 2007, Nucleic Acids Res..

[21]  Dan Wu,et al.  EMBL Nucleotide Sequence Database in 2006 , 2006, Nucleic Acids Res..

[22]  C. V. Jongeneel,et al.  eVOC: a controlled vocabulary for unifying gene expression data. , 2003, Genome research.

[23]  The World Health Report 1997--conquering suffering, enriching humanity. , 1997, World health forum.

[24]  R. Mariani-Costantini,et al.  Analysis of extended genomic rearrangements in oncological research. , 2007, Annals of oncology : official journal of the European Society for Medical Oncology.

[25]  Cathy H. Wu,et al.  The Universal Protein Resource (UniProt) , 2004, Nucleic Acids Res..

[26]  P. Nelson,et al.  Project normal: Defining normal variance in mouse gene expression , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[27]  M. Frost,et al.  A multifaceted educational approach to increasing awareness and use of physician data query (PDQ). , 2009, Journal of cancer education : the official journal of the American Association for Cancer Education.

[28]  Ramón Díaz-Uriarte,et al.  IDconverter and IDClight: Conversion and annotation of gene and protein IDs , 2007, BMC Bioinformatics.

[29]  Wendy A Bickmore,et al.  Chromatin organization in the mammalian nucleus. , 2005, International review of cytology.

[30]  M Kanehisa,et al.  Organizing and computing metabolic pathway data in terms of binary relations. , 1997, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing.

[31]  Evelyn Camon,et al.  The EMBL Nucleotide Sequence Database , 2000, Nucleic Acids Res..

[32]  Martin S. Taylor,et al.  Genome-wide analysis of mammalian promoter architecture and evolution , 2006, Nature Genetics.

[33]  Tsviya Olender,et al.  GeneCardsTM 2002: towards a complete, object-oriented, human gene compendium , 2002, Bioinform..

[34]  C. Reed,et al.  Surgical management of esophageal carcinoma. , 1999, The oncologist.

[35]  T. Barrette,et al.  ONCOMINE: a cancer microarray database and integrated data-mining platform. , 2004, Neoplasia.