Computational Analysis of Muscular Dystrophy Sub-types Using a Novel Integrative Scheme

To construct biologically interpretable features and facilitate Muscular Dystrophy (MD) sub-type classification, we propose a novel integrative scheme that uses a protein–protein interaction (PPI) network, functional gene set information, and mRNA profiling. The workflow of the proposed scheme has three major steps. First, we combine PPI network structure and gene co-expression relationships into a new distance metric and apply affinity propagation clustering to build gene sub-networks. Second, we incorporate functional gene set knowledge to complement the physical interaction information. Finally, using the constructed sub-network and gene set features, we apply a multi-class support vector machine (MSVM) for MD sub-type classification and highlight the biomarkers that contribute to the sub-type prediction. Experimental results show that the scheme constructs sub-networks that are more relevant to MD than those built by a conventional approach. Furthermore, the integrative strategy substantially improved prediction accuracy, especially for hard-to-classify sub-types.
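The first step, combining network structure and co-expression into one distance, can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything here is an assumption for illustration: the weighted-sum form, the weight `alpha`, the hop cap `max_hops`, the use of shortest-path hops for network structure and of 1 − |Pearson r| for co-expression, and the toy genes `A` to `D`.

```python
import math
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first hop counts from `source` to every reachable gene."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)


def combined_distance(adjacency, expression, alpha=0.5, max_hops=4):
    """Hypothetical metric: alpha * topology term + (1 - alpha) * co-expression term.

    Both terms lie in [0, 1]. Unreachable or distant gene pairs are capped
    at `max_hops`, an assumed cut-off that is not taken from the paper.
    """
    genes = sorted(expression)
    dist = {}
    for g in genes:
        hops = shortest_path_lengths(adjacency, g)
        for h in genes:
            if g == h:
                dist[(g, h)] = 0.0
                continue
            topo = min(hops.get(h, max_hops), max_hops) / max_hops
            coexp = 1.0 - abs(pearson(expression[g], expression[h]))
            dist[(g, h)] = alpha * topo + (1.0 - alpha) * coexp
    return dist


# Toy undirected PPI graph and expression profiles (invented data).
ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
expr = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1],
    "D": [1, 3, 2, 4],
}
dist = combined_distance(ppi, expr)
# Affinity propagation expects similarities, so negate the distances.
similarity = {pair: -d for pair, d in dist.items()}
```

A matrix built from `similarity` could then be passed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`. The resulting clusters would play the role of the gene sub-networks described above.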
