Open pipelines for integrated tumor genome profiles reveal differences between pancreatic cancer tumors and cell lines

We describe open, reproducible pipelines that build an integrated genomic profile of a cancer and use that profile to find disease-associated mutations and potentially useful drugs. The pipelines analyze high-throughput cancer exome and transcriptome sequence data together with public databases. We have developed three pipelines: (1) an exome analysis pipeline, which uses whole or targeted tumor exome sequence data to produce a list of putative variants (no matched normal data are needed); (2) a transcriptome analysis pipeline, which processes whole tumor transcriptome sequence (RNA-seq) data to compute gene expression and find potential gene fusions; and (3) an integrated variant analysis pipeline, which uses the tumor variants from the exome pipeline and tumor gene expression from the transcriptome pipeline to identify deleterious and druggable mutations in all genes and in highly expressed genes. The pipelines are integrated into the popular Web platform Galaxy at http://usegalaxy.org/cancer to make them accessible and reproducible, thereby providing an approach for standardized, distributed analyses in clinical studies. We have used the pipelines to identify similarities and differences between pancreatic adenocarcinoma cell lines and primary tumors.
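
The integration step of the third pipeline can be illustrated with a minimal Python sketch. It is not the Galaxy workflow itself: the `Variant` record, the `integrate` function, the `min_expression` cutoff, and the placeholder gene names and values are all assumptions made for illustration. The sketch assumes each variant already carries a deleteriousness call, that druggable genes are available as a set drawn from a drug-gene database, and that expression is a per-gene number from the transcriptome pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one putative variant from the exome pipeline."""
    gene: str
    change: str
    deleterious: bool  # assumed to come from an upstream annotation step


def integrate(variants, expression, druggable_genes, min_expression=10.0):
    """Return (all_hits, expressed_hits).

    all_hits: deleterious variants in druggable genes, across all genes.
    expressed_hits: the subset whose gene expression meets min_expression
    (an arbitrary illustrative cutoff, not a value from the paper).
    """
    all_hits = [
        v for v in variants
        if v.deleterious and v.gene in druggable_genes
    ]
    expressed_hits = [
        v for v in all_hits
        if expression.get(v.gene, 0.0) >= min_expression
    ]
    return all_hits, expressed_hits


# Toy inputs with placeholder names; these are not real data or results.
variants = [
    Variant("GENE_A", "p.X1Y", True),
    Variant("GENE_B", "p.X2Y", True),
    Variant("GENE_C", "p.X3Y", False),
]
expression = {"GENE_A": 50.0, "GENE_B": 1.0, "GENE_C": 80.0}
druggable_genes = {"GENE_A", "GENE_B"}

all_hits, expressed_hits = integrate(variants, expression, druggable_genes)
```

Returning both lists mirrors the two views named in the abstract: mutations in all genes, and mutations restricted to highly expressed genes.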
