Multi-label multi-instance transfer learning for simultaneous reconstruction and cross-talk modeling of multiple human signaling pathways

Background: Signaling pathways play important roles in cell growth, apoptosis and organism development. At present, the known signal transduction networks are far from complete. As an effective complement to experimental methods, computational modeling can reconstruct signaling pathways rapidly and at low cost. To our knowledge, existing computational methods seldom integrate more than three signaling pathways into one predictive model for discovering novel signaling components and modeling the cross-talk between signaling pathways.

Results: In this work, we propose a multi-label multi-instance transfer learning method to simultaneously reconstruct 27 human signaling pathways and model their cross-talk. Computational results show that the proposed method achieves satisfactory multi-label learning performance and yields rational proteome-wide predictions. Some of the predicted signaling components and pathway-targeted proteins have been validated by recent literature. The predicted signaling components are further linked to pathways using experimentally derived protein-protein interactions (PPIs) to reconstruct the human signaling pathways. A map of the cross-talk via common signaling components and common signaling PPIs is then inferred, providing insights into the regulatory and cooperative relationships between signaling pathways. Lastly, gene ontology enrichment analysis is conducted to gain statistical knowledge about the reconstructed human signaling pathways.

Conclusions: The multi-label learning framework is shown in this work to be effective for modeling the phenomenon that a signaling protein can belong to more than one signaling pathway. As a result, novel signaling components and pathway-targeted proteins are predicted, allowing multiple human signaling pathways and a static map of their cross-talk to be reconstructed simultaneously for further biomedical research.
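The idea of inferring cross-talk from common signaling components and common signaling PPIs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `crosstalk_map`, the toy proteins (`P1` to `P4`) and the pathway labels are hypothetical, and the rule that a PPI belongs to a pathway when both of its partners carry that pathway label is an assumption made only for this illustration.

```python
from itertools import combinations


def crosstalk_map(membership, ppis):
    """Pair up pathways that share components or PPIs.

    membership: dict mapping protein -> set of pathway labels (multi-label output).
    ppis: iterable of (protein_a, protein_b) interaction pairs.
    Returns a dict mapping (pathway_1, pathway_2) -> (common proteins, common PPIs).
    """
    pathways = sorted({pw for labels in membership.values() for pw in labels})

    # Proteins assigned to each pathway.
    components = {
        pw: {prot for prot, labels in membership.items() if pw in labels}
        for pw in pathways
    }

    # Assumption: a PPI is assigned to every pathway that both partners belong to.
    edges = {pw: set() for pw in pathways}
    for a, b in ppis:
        for pw in membership.get(a, set()) & membership.get(b, set()):
            edges[pw].add(frozenset((a, b)))

    result = {}
    for pw1, pw2 in combinations(pathways, 2):
        common_proteins = components[pw1] & components[pw2]
        common_ppis = edges[pw1] & edges[pw2]
        if common_proteins or common_ppis:
            result[(pw1, pw2)] = (common_proteins, common_ppis)
    return result


# Hypothetical toy data: P1 and P2 are predicted to belong to two pathways.
toy_membership = {
    "P1": {"pathway_A", "pathway_B"},
    "P2": {"pathway_A", "pathway_B"},
    "P3": {"pathway_A"},
    "P4": {"pathway_C"},
}
toy_ppis = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]

result = crosstalk_map(toy_membership, toy_ppis)
for pair, (proteins, interactions) in result.items():
    print(pair, sorted(proteins), [sorted(e) for e in interactions])
```

On this toy input, only `pathway_A` and `pathway_B` are linked, through the shared proteins `P1` and `P2` and the shared interaction between them. Multi-label membership is what makes such overlaps possible, since a single-label model would assign each protein to exactly one pathway.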
