Gene co-expression network connectivity is an important determinant of selective constraint

While several studies have investigated general properties of the genetic architecture of natural variation in gene expression, few have considered natural, outbreeding populations. In parallel, systems biology has established that biological networks are generally scale-free, which buffers them against random mutations. To date, few studies have examined the relationship between the selective processes that maintain natural variation in gene expression and the structure of the associated co-expression network. Here we used RNA sequencing to assay gene expression in winter buds undergoing bud flush in a natural population of Populus tremula, an outbreeding forest tree species. We performed expression quantitative trait locus (eQTL) mapping and identified 164,290 significant eQTLs associating 6,241 unique genes (eGenes) with 147,419 unique SNPs (eSNPs). We found approximately four times as many local as distant eQTLs, and local eQTLs had significantly higher effect sizes. eQTLs were primarily located in regulatory regions of genes (UTRs or flanking regions), regardless of whether they were local or distant. We used the gene expression data to infer a co-expression network and investigated the relationship between network topology, the genetic architecture of gene expression and signatures of selection. Within the co-expression network, eGenes were underrepresented in network module cores (hubs) and overrepresented in the periphery of the network, and eQTL effect size was negatively correlated with network connectivity. We additionally found that module core genes have experienced stronger selective constraint on coding and non-coding sequence, and that connectivity is associated with signatures of selection.
Our integrated genetics and genomics results suggest that purifying selection is the primary mechanism underlying the genetic architecture of natural variation in gene expression assayed in flushing leaf buds of P. tremula and that connectivity within the co-expression network is linked to the strength of purifying selection.
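
The relationship between network connectivity and eQTL effect size reported above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy expression values, the toy effect sizes and the soft-threshold power `beta=6` are all assumptions made for illustration. Connectivity is taken here as the sum of absolute Pearson correlations raised to `beta`, one common way of defining weighted connectivity, and is then compared with effect size using a Spearman rank correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def connectivity(expr, beta=6):
    """Weighted connectivity per gene: sum of |r|**beta over all other genes.

    expr maps a gene name to its expression values across samples.
    beta is an assumed soft-threshold power, not a value from the study.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            w = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += w
            k[h] += w
    return k

# Hypothetical toy data: three tightly co-expressed genes and one peripheral gene.
toy_expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [1, -1, 0, -1, 1],
}
# Hypothetical eQTL effect sizes (absolute values) for the same genes.
toy_effect = {"geneA": 0.10, "geneB": 0.20, "geneC": 0.15, "geneD": 0.90}

k = connectivity(toy_expr)
genes = list(toy_expr)
rho = spearman([k[g] for g in genes], [toy_effect[g] for g in genes])
print({g: round(k[g], 3) for g in genes}, round(rho, 3))
```

In this toy example the peripheral gene has the lowest connectivity and the largest effect size, so the rank correlation is negative. A real analysis would also need multiple-testing control, correction for hidden confounders and a principled choice of the soft-threshold power.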
