Linking Interactome to Disease: A Network-Based Analysis of Metastatic Relapse in Breast Cancer

The introduction of high-throughput gene expression profiling technologies (DNA microarrays) in molecular biology, and their expected clinical applications, have allowed the design of predictive signatures linked to a particular clinical condition or patient outcome in a given clinical setting. However, such signatures have been shown to suffer from several problems: (i) they are highly unstable and depend strongly on the set of patients chosen for training; (ii) the data are problematic in terms of dimensionality (too many variables for too few samples); (iii) diseases such as cancer are caused by subtle misregulations that cannot be readily detected by current analysis methods. To find a predictive signature that generalizes across multiple datasets, a strategy was devised that superimposes large-scale protein-protein interaction data (the human interactome) onto several gene expression datasets (2,464 breast cancer tumors in total) in order to find discriminative regions of the interactome (subnetworks) that predict metastatic relapse in breast cancer. This method, Interactome-Transcriptome Integration (ITI), was applied to several breast cancer DNA microarray datasets and allowed the extraction of a signature consisting of 119 subnetworks. All subnetworks have been stored in a relational database and linked to the Gene Ontology and NCBI EntrezGene annotation databases for analysis. Exploration of these annotations has shown that this set of subnetworks reflects several biological processes linked to cancer and is a good candidate for establishing a network-based signature for the prediction of metastatic relapse in breast cancer.
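
To make the idea of superimposing an interactome onto expression data concrete, the following is a minimal sketch of one common way to search for discriminative subnetworks. It is an illustration only, not the ITI implementation. It assumes a greedy search that grows a subnetwork from a seed gene and scores it by the absolute Pearson correlation between the mean expression of its genes and the relapse labels. The function names (`pearson`, `subnetwork_score`, `grow_subnetwork`), the `max_size` parameter, and the toy genes and values are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def subnetwork_score(genes, expression, labels):
    """Assumed score: |correlation| between mean subnetwork expression and outcome."""
    activity = [
        sum(expression[g][i] for g in genes) / len(genes)
        for i in range(len(labels))
    ]
    return abs(pearson(activity, labels))

def grow_subnetwork(seed, interactome, expression, labels, max_size=10):
    """Greedily add the interacting neighbor that most improves the score."""
    members = {seed}
    best = subnetwork_score(members, expression, labels)
    while len(members) < max_size:
        # Neighbors of current members that have expression data and are not yet included.
        frontier = {
            nb for g in members for nb in interactome.get(g, ())
            if nb in expression
        } - members
        candidates = [
            (subnetwork_score(members | {nb}, expression, labels), nb)
            for nb in sorted(frontier)
        ]
        if not candidates:
            break
        score, gene = max(candidates)
        if score <= best:  # stop when no neighbor improves the score
            break
        members.add(gene)
        best = score
    return members, best

# Toy data (hypothetical): 4 genes, 6 tumors, label 1 = metastatic relapse.
interactome = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"], "D": ["A"]}
expression = {
    "A": [2.0, 1.0, 2.0, 1.0, 0.0, 1.0],
    "B": [1.0, 2.0, 1.0, 0.0, 1.0, 0.0],
    "C": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "D": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
}
labels = [1, 1, 1, 0, 0, 0]

members, best = grow_subnetwork("A", interactome, expression, labels)
print(sorted(members), round(best, 3))
```

In this toy case, neither gene A nor gene B separates the two groups perfectly alone, but their averaged expression does, which illustrates why a subnetwork can be more discriminative than any single gene. A real analysis would additionally need a significance test for each subnetwork and validation across independent datasets.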