A multi-level classification framework for multi-site medical data: Application to the ADHD-200 collection

A classification approach to address the heterogeneity of multi-site medical databases. A promising learning scheme for developing consistent diagnostic-aid models. A case study on Attention Deficit Hyperactivity Disorder.

A culture of sharing medical data has recently emerged, significantly lowering the barriers to medical research. Because this initiative produces large open-access datasets, data mining techniques can be used to develop interpretable expert systems that assist diagnosis. However, collaborative data gathering yields heterogeneous databases, owing to both technical and geographical factors. On the one hand, data collection protocols have not yet been harmonized. On the other hand, cultural and social factors locally affect both the epidemiology and the etiology of a given disease. Ignoring these factors could weaken the credibility of studies based on multi-site data. Our work therefore addresses the development of computer-aided diagnosis systems that rely on heterogeneous data. To this end, we propose a multi-level approach (inspired by multi-level statistical modeling) based on decision trees (in the machine learning sense). The framework is applied to the public ADHD-200 collection for the study of Attention Deficit Hyperactivity Disorder (ADHD).
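To make the idea of a multi-level, tree-based classifier concrete, here is a minimal sketch. It is not the method of the paper, whose details the abstract does not give. It assumes one plausible reading: a pooled model forms the top level, one model per acquisition site forms the lower level, and prediction uses the site model when the site is known. Depth-1 decision trees (stumps) stand in for full decision trees, and the function names, the single feature, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y):
    """Fit a depth-1 decision tree by minimising training errors.

    Returns (feature_index, threshold, left_label, right_label).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no valid split: fall back to a constant predictor
        lab = majority(y)
        return (0, float("inf"), lab, lab)
    return best[1:]


def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab


def fit_multilevel(X, y, sites):
    """Assumed scheme: one pooled stump plus one stump per site."""
    model = {"global": fit_stump(X, y), "site": {}}
    by_site = defaultdict(list)
    for row, lab, site in zip(X, y, sites):
        by_site[site].append((row, lab))
    for site, pairs in by_site.items():
        rows, labs = zip(*pairs)
        model["site"][site] = fit_stump(list(rows), list(labs))
    return model


def predict_multilevel(model, row, site):
    """Use the site-level stump if the site was seen, else the pooled one."""
    stump = model["site"].get(site, model["global"])
    return predict_stump(stump, row)


# Hypothetical toy data: one feature whose scale differs between two sites,
# mimicking a protocol or scanner offset. Labels: 0 = control, 1 = ADHD.
X = [[1], [2], [3], [4], [5], [6], [7], [8]]
y = [0, 0, 1, 1, 0, 0, 1, 1]
sites = ["A", "A", "A", "A", "B", "B", "B", "B"]

model = fit_multilevel(X, y, sites)
pooled_acc = sum(predict_stump(model["global"], r) == lab
                 for r, lab in zip(X, y)) / len(y)
multi_acc = sum(predict_multilevel(model, r, s) == lab
                for r, s, lab in zip(X, sites, y)) / len(y)
print(pooled_acc, multi_acc)
```

On this toy data, a single pooled threshold cannot separate both sites, whereas the per-site level can. Real multi-site data would call for held-out evaluation and a more careful rule for combining levels.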
