Principles and methods of integrative genomic analyses in cancer

Combined analyses of molecular data, such as DNA copy-number alteration, mRNA and protein expression, point to biological functions and molecular pathways being deregulated in multiple cancers. Genomic, metabolomic and clinical data from various solid cancers and model systems are emerging and can be used to identify novel patient subgroups for tailored therapy and monitoring. The integrative genomics methodologies that are used to interpret these data require expertise in different disciplines, such as biology, medicine, mathematics, statistics and bioinformatics, and they can seem daunting. The objectives, methods and computational tools of integrative genomics that are available to date are reviewed here, as is their implementation in cancer research.
