Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets

Significance

Large-scale biological network analyses often borrow concepts from social network analysis (e.g., finding “communities” and “hubs”). Mathematically advanced engineering concepts, however, have so far been applied in biology only to small, well-characterized networks. Here, we applied a tool from control theory to analyze a large-scale directed human protein–protein interaction network. Our analysis revealed that the proteins that are indispensable from a network controllability perspective are also commonly targeted by disease-causing mutations and human viruses, or have been identified as drug targets. Furthermore, we used the controllability analysis to prioritize novel cancer genes from cancer genomic datasets. Altogether, we demonstrated an application of network controllability analysis to identify new disease genes and drug targets.

The protein–protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This analysis allows us to classify proteins as “indispensable,” “neutral,” or “dispensable,” according to whether removing the protein increases, leaves unchanged, or decreases the number of driver nodes in the network. We find that 21% of the proteins in the PPI network are indispensable. These indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network’s control property is critical for the transition between healthy and disease states. Furthermore, analysis of copy number alteration data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Of these 56 genes, 46 have not previously been associated with cancer. These results suggest that controllability analysis is useful for identifying novel disease genes and potential drug targets.
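The node classification described above can be illustrated with a minimal sketch. It assumes the standard maximum-matching formulation of structural controllability, in which the minimum number of driver nodes equals the number of nodes left unmatched by a maximum matching of the directed network (at least one). The function names and the four-node toy graph are hypothetical and chosen for illustration only; this is not the authors' implementation, and the simple augmenting-path matching used here is suitable only for small examples.

```python
def max_matching_size(nodes, edges):
    """Size of a maximum matching in the bipartite view of a directed graph.

    Each directed edge u -> v links the "out" copy of u to the "in" copy of v.
    Uses simple recursive augmenting paths, which is fine for toy graphs only.
    """
    succ = {u: [] for u in nodes}
    for u, v in edges:
        if u in succ and v in succ:
            succ[u].append(v)
    match_in = {}  # maps a matched target node v to its source node u

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match_in or try_augment(match_in[v], seen):
                match_in[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in nodes)


def driver_node_count(nodes, edges):
    """Minimum number of driver nodes: unmatched nodes, but never fewer than one."""
    return max(len(nodes) - max_matching_size(nodes, edges), 1)


def classify_nodes(nodes, edges):
    """Label each node by how its removal changes the driver node count."""
    baseline = driver_node_count(nodes, edges)
    labels = {}
    for n in nodes:
        rest = [m for m in nodes if m != n]
        kept = [(u, v) for u, v in edges if u != n and v != n]
        after = driver_node_count(rest, kept)
        if after > baseline:
            labels[n] = "indispensable"
        elif after < baseline:
            labels[n] = "dispensable"
        else:
            labels[n] = "neutral"
    return labels


# Hypothetical toy network: A -> B, B -> C, B -> D
toy_nodes = ["A", "B", "C", "D"]
toy_edges = [("A", "B"), ("B", "C"), ("B", "D")]
print(driver_node_count(toy_nodes, toy_edges))
print(classify_nodes(toy_nodes, toy_edges))
```

In this toy network, removing B disconnects everything and raises the driver node count, so B is indispensable; removing C or D leaves a simple chain that needs fewer drivers, so they are dispensable; removing A changes nothing, so it is neutral. For a network of several thousand proteins, a faster matching algorithm such as Hopcroft–Karp would be a more practical choice than the recursive search shown here.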
