Interactome-transcriptome integration for predicting distant metastasis in breast cancer

MOTIVATION: High-throughput gene expression profiling yields genomic signatures that allow the prediction of clinical conditions, including patient outcome. However, these signatures have limitations, such as dependence on the training set and, worse, lack of generalization.

RESULTS: We propose a novel algorithm called ITI (interactome-transcriptome integration) to extract a genomic signature predicting distant metastasis in breast cancer by superimposing large-scale protein-protein interaction data over a compendium of several gene expression datasets. Training on two different compendia showed that the estrogen receptor-specific signatures obtained are more stable (11-35% stability), generalize to independent data and perform better (53-74% accuracy) than previously published methods.

AVAILABILITY: The source code of the ITI algorithm and of the analyses is available under the CeCILL license from the ITI companion website: http://bioinformatique.marseille.inserm.fr/iti.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
