Multi-Domain Networks Association for Biological Data Using Block Signed Graph Clustering

Multi-domain biological network association and clustering have attracted considerable attention in biological data integration, because they can provide a more global and accurate understanding of biological phenomena. In many problems, different domains may have different cluster structures. Owing to the rapid growth of data collection from different sources, some domains may be strongly or weakly associated with the other domains. A key challenge is to determine the degree of association among different domains and to achieve accurate clustering results through data integration. In this paper, we propose an unsupervised learning approach for multi-domain network association using block signed graph clustering. In particular, by calculating consistency weights, the proposed algorithm automatically identifies domains that are strongly (or weakly) related to each other by assigning them larger (or smaller) weights. This approach not only significantly improves clustering accuracy but also reveals the associations among multi-domain networks. In each iteration of the proposed algorithm, we update the consistency weights based on the cluster structure of each domain, and then use different sets of eigenvectors to obtain different cluster structures in each domain. Experimental results on both synthetic and real data sets (including neuron activity data and gene expression data) demonstrate the effectiveness of the proposed algorithm in clustering performance and in domain association capability.
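
The abstract does not spell out the block construction or the consistency-weight update, so the following is only a minimal sketch of the generic building block that signed graph clustering relies on: the signed Laplacian L = D̄ - A, where D̄ is diagonal with entries equal to the row sums of |A|. The function names (`signed_laplacian`, `two_way_signed_cut`) and the toy graph are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def signed_laplacian(A):
    """Signed Laplacian L = D_bar - A, with D_bar[i, i] = sum_j |A[i, j]|."""
    A = np.asarray(A, dtype=float)
    D_bar = np.diag(np.abs(A).sum(axis=1))
    return D_bar - A


def two_way_signed_cut(A):
    """Split the nodes into two groups by the sign of the eigenvector
    that belongs to the smallest eigenvalue of the signed Laplacian.
    (Hypothetical helper for illustration, not the paper's algorithm.)"""
    L = signed_laplacian(A)
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    labels = (eigvecs[:, 0] > 0).astype(int)
    return labels, eigvals


# Toy signed graph (assumed data): nodes 0-2 and 3-5 form two groups.
# Edges inside a group are positive, edges across groups are negative.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0
for i, j in [(2, 3), (0, 5)]:
    A[i, j] = A[j, i] = -1.0

labels, eigvals = two_way_signed_cut(A)
print(labels)
```

On this toy graph every negative edge crosses the two groups, so the smallest eigenvalue is zero (up to rounding) and the signs of its eigenvector separate nodes 0-2 from nodes 3-5. Which group receives label 0 depends on the arbitrary sign of the eigenvector returned by the solver.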
