Network-regularized bi-clique finding for tumor stratification

Complex diseases such as cancer are highly heterogeneous, giving rise to tumor subtypes that differ in behavior, including survival time, treatment response, and recurrence rate. An important problem in biomedical research is to identify tumor subtypes together with the specific genetic markers associated with each subtype. This tumor stratification problem has been studied using computational approaches, including traditional clustering and bi-clustering algorithms applied to available genomic data. In this study, we discuss the issues and challenges of existing computational approaches to tumor stratification. We show that the problem can be formulated as finding densely connected sub-graphs (bi-cliques) in a bipartite graph representation of genomic data. We propose a novel algorithm that exploits prior biological knowledge, in the form of a gene-gene interaction network, to find such sub-graphs, thereby identifying tumor subtypes and their corresponding genetic markers simultaneously. Our experimental results show that the proposed method outperforms current state-of-the-art methods for tumor stratification.
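
To make the bipartite formulation concrete, the following is a minimal sketch, not the proposed algorithm. It assumes the genomic data can be summarized as a binary patient-by-gene matrix (1 meaning the gene is altered in that patient's tumor); the toy data and the helper names `bipartite_edges`, `density`, and `is_biclique` are hypothetical and chosen only for illustration. It shows how a candidate subtype (a patient subset) and its candidate markers (a gene subset) can be scored by how densely they are connected; the network regularization using gene-gene interactions is not modeled here.

```python
from itertools import product

# Hypothetical toy data: rows are patients, columns are genes;
# a 1 means the gene is altered in that patient's tumor.
patients = ["p1", "p2", "p3", "p4"]
genes = ["gA", "gB", "gC"]
matrix = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]

def bipartite_edges(patients, genes, matrix):
    """Return the edge set of the patient-gene bipartite graph."""
    return {
        (p, g)
        for (i, p), (j, g) in product(enumerate(patients), enumerate(genes))
        if matrix[i][j]
    }

def density(edges, patient_subset, gene_subset):
    """Fraction of possible patient-gene pairs in the subsets that are edges."""
    pairs = list(product(patient_subset, gene_subset))
    if not pairs:
        return 0.0
    return sum(pair in edges for pair in pairs) / len(pairs)

def is_biclique(edges, patient_subset, gene_subset):
    """A bi-clique connects every chosen patient to every chosen gene."""
    return density(edges, patient_subset, gene_subset) == 1.0

edges = bipartite_edges(patients, genes, matrix)
```

In this toy example, patients `p1` and `p2` together with genes `gA` and `gB` form a bi-clique, whereas adding `p3` lowers the density because `p3` lacks an alteration in `gA`. Real genomic data are noisy, which is why the formulation above speaks of densely connected sub-graphs rather than requiring perfect bi-cliques.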