Network-based Identification of Novel Cancer Genes

Genes involved in cancer susceptibility and progression can serve as templates for searching protein networks for novel cancer genes. To this end, we introduce a general network searching method, MaxLink, and apply it to find and rank cancer gene candidates by their connectivity to known cancer genes. First, we compiled a new set of 812 genes involved in cancer, more than twice the number in the Cancer Gene Census. Using a comprehensive protein interaction network, we then extracted the network neighbors of these genes. This candidate list was refined by selecting genes that have unexpectedly high connectivity to cancer genes and no previous association with cancer. This produced a list of 1891 new cancer candidates with up to 55 connections to known cancer genes. We validated our method by cross-validation, Gene Ontology term bias, and differential expression in cancer versus normal tissue. An example novel cancer gene candidate is presented with a detailed analysis of its local network and neighbor annotation. Our study provides a ranked list of high-priority targets for further studies in cancer research. Supplemental material is included.
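
The ranking idea described above can be illustrated with a minimal sketch. The abstract does not specify how MaxLink scores candidates or how "unexpectedly high" connectivity is tested, so everything here is an assumption: candidates are scored by a plain count of direct links to known cancer genes, and a hypergeometric tail probability stands in as one possible significance measure. The function names `maxlink_rank` and `hypergeom_tail` and the toy gene labels are hypothetical.

```python
from collections import defaultdict
from math import comb


def maxlink_rank(edges, seeds, min_links=1):
    """Rank non-seed genes by their number of direct links to seed genes.

    edges: iterable of (gene_a, gene_b) pairs, treated as undirected.
    seeds: known cancer genes (assumed input, e.g. a curated gene set).
    Returns (gene, links_to_seeds, degree) tuples, most links first.
    """
    seeds = set(seeds)
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)

    ranked = []
    for gene, nbrs in neighbors.items():
        if gene in seeds:
            continue  # only genes not already known are candidates
        links = len(nbrs & seeds)
        if links >= min_links:
            ranked.append((gene, links, len(nbrs)))
    # Sort by descending link count, then by name for a stable order.
    ranked.sort(key=lambda r: (-r[1], r[0]))
    return ranked


def hypergeom_tail(k, n, K, N):
    """P(X >= k) for a gene with n neighbors drawn from N genes, K of them seeds.

    Assumption: a hypergeometric null model; the source does not name its test.
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Toy network: S1..S3 are seeds, A..D are unannotated genes.
toy_edges = [("S1", "A"), ("S2", "A"), ("S3", "A"),
             ("S1", "B"), ("B", "C"), ("C", "D"), ("S1", "S2")]
toy_seeds = {"S1", "S2", "S3"}
ranking = maxlink_rank(toy_edges, toy_seeds)
```

In this toy network, gene A links to all three seeds and ranks first, while C and D have no direct seed links and are dropped. A real analysis would also need the filtering against prior cancer annotation that the abstract mentions, which this sketch omits.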
