Ensemble validation paradigm for intelligent data analysis in autism spectrum disorders

Cluster analysis is an important exploratory tool for a broad range of applications, including the analysis of biomedical datasets to uncover meaningful subgroups, for example in autism spectrum disorder (ASD). For a given clustering algorithm, multiple results can be obtained on the same dataset by varying the algorithm's parameters. In biomedical applications, the goal is to discover meaningful subgroups, not just the optimal number of clusters. It is therefore imperative to develop quality measures capable of identifying the optimal partitions of a given dataset. In this paper, we apply several clustering methods to an ASD simplex sample, grouping it by relevant phenotype features that may reveal meaningful subtypes. We present a detailed cluster validation analysis using an ensemble validation paradigm and visualization techniques. We then present a rigorous clinical/behavioral analysis of the top-ranked results. The evaluation showed that both configurations yielded similar clinically significant results: a two-subgroup configuration with distinct clinical profiles.
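The idea of ranking many candidate partitions with several validity measures at once can be illustrated with a minimal sketch. It is not the method of this paper: the choice of clustering algorithms (k-means and agglomerative clustering), the three internal indices (silhouette, Calinski-Harabasz, Davies-Bouldin), the mean-rank aggregation, the helper name `rank_partitions`, and the synthetic two-blob data are all assumptions made for illustration, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def rank_partitions(X, k_values=(2, 3, 4, 5), seed=0):
    """Rank candidate partitions by their mean rank over three internal indices.

    Hypothetical helper: the algorithms, indices, and aggregation rule are
    illustrative assumptions, not the configuration used in the paper.
    """
    candidates = []
    for k in k_values:
        models = {
            "kmeans": KMeans(n_clusters=k, n_init=10, random_state=seed),
            "agglomerative": AgglomerativeClustering(n_clusters=k),
        }
        for name, model in models.items():
            labels = model.fit_predict(X)
            candidates.append({
                "method": name,
                "k": k,
                "silhouette": silhouette_score(X, labels),
                "calinski_harabasz": calinski_harabasz_score(X, labels),
                # Davies-Bouldin is "lower is better"; negate it so that
                # every column can be ranked as "higher is better".
                "neg_davies_bouldin": -davies_bouldin_score(X, labels),
            })

    index_names = ["silhouette", "calinski_harabasz", "neg_davies_bouldin"]
    scores = np.array([[c[n] for n in index_names] for c in candidates])
    # Per-index rank of each candidate (0 = best); ties are broken by order.
    ranks = (-scores).argsort(axis=0).argsort(axis=0)
    for cand, mean_rank in zip(candidates, ranks.mean(axis=1)):
        cand["mean_rank"] = float(mean_rank)
    return sorted(candidates, key=lambda c: c["mean_rank"])


# Synthetic stand-in for a phenotype feature matrix: two well-separated groups.
X, _ = make_blobs(
    n_samples=300, centers=[[-5, -5], [5, 5]], cluster_std=1.0, random_state=0
)
ranking = rank_partitions(X)
for cand in ranking[:3]:
    print(cand["method"], cand["k"], round(cand["mean_rank"], 2))
```

Aggregating ranks rather than raw scores keeps indices with very different scales from dominating one another. In a real analysis the top-ranked partitions would still need the kind of clinical/behavioral review described in the abstract.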
