ANMerge: A Comprehensive and Accessible Alzheimer's Disease Patient-Level Dataset.

BACKGROUND: Accessible datasets are of fundamental importance to the advancement of Alzheimer's disease (AD) research. The AddNeuroMed consortium conducted a longitudinal observational cohort study with the aim of discovering AD biomarkers. During this study, a broad selection of data modalities was measured, including clinical assessments, magnetic resonance imaging, genotyping, transcriptomic profiling, and blood plasma proteomics. Some of the collected data were shared with third-party researchers. However, the shared data were incomplete, contained errors, and lacked interoperability.

OBJECTIVE: To provide the research community with an accessible, multimodal, patient-level AD cohort dataset.

METHODS: We systematically addressed several limitations of the originally shared data and added previously unreleased data to enhance the patient-level dataset.

RESULTS: We publish and describe ANMerge, a new version of the AddNeuroMed dataset. ANMerge includes multimodal data from 1,702 study participants and is accessible to the research community via a centralized portal.

CONCLUSION: ANMerge is an information-rich, patient-level data resource that can serve as a discovery and validation cohort for data-driven AD research, such as machine learning and artificial intelligence approaches.
