Applications of node-based resilience graph theoretic framework to clustering autism spectrum disorders phenotypes

With the growing ubiquity of data in network form, clustering in the context of a network, represented as a graph, has become increasingly important. Clustering is a useful exploratory machine learning tool that helps make sense of heterogeneous data by grouping items with similar attributes according to some criterion. This paper investigates the application of a novel graph-theoretic clustering method, Node-Based Resilience clustering (NBR-Clust), to address the heterogeneity of Autism Spectrum Disorder (ASD) and identify meaningful subgroups. The hypothesis is that analysis of these subgroups will reveal relevant biomarkers that provide a better understanding of ASD phenotypic heterogeneity and are useful for further ASD studies. We address graph constructions suited to representing the ASD phenotype data. The sample population is drawn from a very large, rigorous dataset, the Simons Simplex Collection (SSC). Analysis of the results, performed using graph quality measures, internal cluster validation measures, and clinical analysis outcomes, demonstrates the potential usefulness of resilience measure clustering for biomedical datasets. We also conduct feature extraction analysis to characterize relevant biomarkers that delineate the resulting subgroups. The optimal results predominantly favored a 5-cluster configuration.
