Semantic Mining based on graph theory and ontologies. Case Study: Cell Signaling Pathways

In this paper we use concepts from graph theory and cellular biology, represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes the basic cellular activities and their interactions. The main contribution of this paper is to the signaling pathway research area: it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information and thereby produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering and centrality) frequently used in social network analysis. It consists of two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we call communities; the second uses ontologies for the semantic enrichment of the cellular communities. The graph theory measures allow us to determine the set of cells that are close (for example, in a disease) and the main cells in each community. We analyze our approach in two cases: TGF-β and Alzheimer's disease.
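
As a rough illustration of the three graph measures named in the abstract, the following sketch computes modularity for a candidate partition, the local clustering coefficient of a node, and closeness centrality to pick a "main" node per community. It is not the authors' implementation: the toy edge list, the node names, the fixed two-community partition and the choice of closeness as the centrality measure are all assumptions made for this example.

```python
from collections import deque

# Hypothetical toy network: node names are placeholders, not real pathway data.
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "C"),  # first tightly knit group
    ("D", "E"), ("D", "F"), ("E", "F"),  # second tightly knit group
    ("C", "D"),                          # single bridge between the groups
]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def modularity(adj, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for community in communities:
        # Each internal edge is seen from both endpoints, hence the division by 2.
        internal = sum(1 for u in community for v in adj[u] if v in community) / 2
        degree_sum = sum(len(adj[u]) for u in community)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))


def closeness_centrality(adj, source):
    """Reachable node count divided by the sum of BFS distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


adj = build_adjacency(EDGES)
communities = [{"A", "B", "C"}, {"D", "E", "F"}]  # assumed candidate partition
q = modularity(adj, communities)
# The "main" node of a community is taken here as its most central member.
main_nodes = [
    max(sorted(c), key=lambda n: closeness_centrality(adj, n)) for c in communities
]
```

On this toy graph the two bridge endpoints come out as the most central members of their groups. A real analysis would replace the fixed partition with a community detection step that searches for a high-modularity partition, and would then annotate each community with ontology terms in the second phase.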
