FDR-Corrected Sparse Canonical Correlation Analysis With Applications to Imaging Genomics

Reducing the number of false discoveries is one of the most pressing issues in the life sciences. It is especially important in neuroimaging and genomics, where data sets are typically high-dimensional, meaning that the number of explanatory variables exceeds the sample size. The false discovery rate (FDR) is a criterion that addresses this issue, and it has therefore become a popular tool for testing multiple hypotheses. Canonical correlation analysis (CCA) is a statistical technique for making sense of the cross-correlation between two sets of measurements collected on the same set of samples (e.g., brain imaging and genomic data for the same patients with mental illness), and sparse CCA extends the classical method to high-dimensional settings. Here, we propose a way of applying the FDR concept to sparse CCA, together with a method to control the FDR. The proposed FDR correction directly influences the sparsity of the solution, adapting it to the unknown true sparsity level. Theoretical derivations and simulation studies show that our procedure keeps the FDR of the canonical vectors below a user-specified target level. We apply the proposed method to an imaging genomics data set from the Philadelphia Neurodevelopmental Cohort. Our results link brain connectivity profiles, derived from brain activity measured by functional magnetic resonance imaging during an emotion identification task, to the corresponding subjects' genomic data.
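
To make the notion of controlling the FDR at a user-specified target level concrete, the following is a minimal sketch of the classical Benjamini-Hochberg step-up procedure for a list of p-values. It is not the sparse CCA correction proposed here; the function name `benjamini_hochberg`, the parameter `q`, and the example p-values are illustrative assumptions.

```python
import numpy as np


def benjamini_hochberg(p_values, q=0.05):
    """Classical Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns a boolean mask marking the hypotheses rejected at target
    FDR level q. This is the standard multiple-testing procedure, not
    the FDR correction for sparse CCA described in the abstract.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    rejected = np.zeros(m, dtype=bool)
    if m == 0:
        return rejected

    # Sort the p-values and compare the i-th smallest with q * i / m.
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds

    if below.any():
        # Step-up rule: find the largest rank that meets its threshold
        # and reject every hypothesis with a rank up to that one.
        k = np.max(np.nonzero(below)[0])
        rejected[order[: k + 1]] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, chosen only to show the step-up behaviour.
    example = [0.02, 0.03, 0.035, 0.9]
    print(benjamini_hochberg(example, q=0.05))
```

In the hypothetical example, the two smallest p-values miss their own thresholds, yet they are still rejected because the third-smallest p-value meets its threshold. This is what "step-up" means.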
