dbGENVOC: database of GENomic Variants of Oral Cancer, with special reference to India

Abstract: Oral cancer is highly prevalent in India, where it is the most frequent cancer type among males, and it is also very common in southeast Asia. India has participated in the International Cancer Genome Consortium (ICGC) and in several national initiatives that generate large-scale genomic data on oral cancer patients, analyze these data to identify associations and systematically catalog the associated variants. We have created an open, web-accessible database of the variants found to be significantly associated with oral cancer in Indian patients, with a user-friendly interface for easy mining. We have added value to this database by including relevant data on other global populations, collated from various sources, which enables comparative geographical and/or ethnic analyses. No other database of a similar nature is currently available for oral cancer. The database of GENomic Variants of Oral Cancer (dbGENVOC) is a browsable online framework for the storage, retrieval and analysis of large-scale data on genomic variants, and it is freely accessible to the scientific community. At present, users can mine data on ∼24 million clinically relevant somatic and germline variants derived from exomes (n = 100) and whole genomes (n = 5) of Indian oral cancer patients, all generated by us. Variant data from The Cancer Genome Atlas and data manually curated from peer-reviewed publications are also incorporated for comparative analyses. The database can be queried by a single gene, multiple genes, multiple variant sites, genomic region, patient ID and pathway identity. Database URL: http://research.nibmg.ac.in/dbcares/dbgenvoc/
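The query modes listed above (gene, multiple genes, genomic region, patient ID) can be illustrated with a minimal Python sketch over an in-memory list of variant records. This is not the dbGENVOC interface or schema: the `Variant` fields, the function names and the placeholder records are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions, not the dbGENVOC schema."""
    patient_id: str
    gene: str
    chrom: str
    pos: int
    ref: str
    alt: str


def by_genes(variants, genes):
    """Return variants whose gene symbol is in `genes` (one or many, case-insensitive)."""
    wanted = {g.upper() for g in genes}
    return [v for v in variants if v.gene.upper() in wanted]


def by_region(variants, chrom, start, end):
    """Return variants on `chrom` with start <= pos <= end (inclusive bounds)."""
    return [v for v in variants if v.chrom == chrom and start <= v.pos <= end]


def by_patient(variants, patient_id):
    """Return all variants recorded for one patient ID."""
    return [v for v in variants if v.patient_id == patient_id]


# Synthetic placeholder records, not real data from any database.
VARIANTS = [
    Variant("P001", "GENE_A", "1", 100, "C", "T"),
    Variant("P001", "GENE_B", "1", 250, "G", "A"),
    Variant("P002", "GENE_A", "2", 100, "A", "G"),
]

if __name__ == "__main__":
    print(by_genes(VARIANTS, ["gene_a"]))
    print(by_region(VARIANTS, "1", 50, 200))
    print(by_patient(VARIANTS, "P002"))
```

A real deployment at this scale (about 24 million variants) would rely on an indexed database rather than linear scans; the sketch only shows the shape of each query.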
