A global co-expression network approach for connecting genes to specialized metabolic pathways in plants

Plants produce a tremendous diversity of specialized metabolites (SMs) to interact with and manage their environment. A major challenge hindering efforts to tap this seemingly boundless source of pharmacopeia is the identification of SM pathways and their constituent genes. Given the well-established observation that the genes comprising an SM pathway are co-regulated in response to specific environmental conditions, we hypothesized that genes from a given SM pathway would form tight associations (modules) with each other in gene co-expression networks, facilitating their identification. To evaluate this hypothesis, we used 10 global co-expression datasets (each a meta-analysis of hundreds to thousands of expression experiments) across eight plant model organisms to identify hundreds of modules of co-expressed genes for each species. In support of our hypothesis, 15.3–52.6% of modules contained two or more known SM biosynthetic genes (e.g., cytochrome P450s, terpene synthases, and chalcone synthases), and module genes were enriched in SM functions (e.g., glucoside and flavonoid biosynthesis). Moreover, modules recovered many experimentally validated SM pathways in these plants, including all six known to form biosynthetic gene clusters (BGCs). In contrast, genes predicted to form plant BGCs on the basis of physical proximity on a chromosome were no more co-expressed than the null distribution for neighboring genes. These results not only suggest that most predicted plant BGCs do not represent genuine SM pathways but also argue that BGCs are unlikely to be a hallmark of plant specialized metabolism. We submit that global gene co-expression is a rich, but largely untapped, data source for discovering the genetic basis and architecture of plant natural products, and one that can be applied even without knowledge of the genome sequence.
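The module idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the study's pipeline: the gene names and expression values are invented toy data, Pearson correlation with a fixed threshold stands in for whatever co-expression measure the datasets use, and connected components stand in for the module-detection method, which the abstract does not specify. The sketch builds a small co-expression graph, groups genes into modules, and reports the fraction of modules containing two or more known SM biosynthetic genes.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def coexpression_modules(expression, threshold=0.9):
    """Link genes whose correlation is >= threshold; return connected components.

    Genes with no co-expressed partner are left out (no singleton modules).
    """
    genes = sorted(expression)
    neighbors = {g: set() for g in genes}
    for a, b in combinations(genes, 2):
        if pearson(expression[a], expression[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen or not neighbors[gene]:
            continue
        stack, component = [gene], set()
        while stack:  # depth-first traversal of one component
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        modules.append(component)
    return modules


def fraction_with_sm_genes(modules, sm_genes, minimum=2):
    """Fraction of modules holding at least `minimum` known SM biosynthetic genes."""
    if not modules:
        return 0.0
    hits = sum(1 for m in modules if len(m & sm_genes) >= minimum)
    return hits / len(modules)


# Hypothetical toy data: expression of six genes across five conditions.
expression = {
    "P450_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "TPS_b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "CHS_c": [1.1, 2.0, 2.9, 4.2, 5.0],
    "HK_d": [5.0, 1.0, 4.0, 2.0, 3.0],
    "HK_e": [5.2, 1.1, 3.9, 2.0, 3.1],
    "HK_f": [3.0, 3.0, 3.0, 3.0, 3.0],
}
known_sm_genes = {"P450_a", "TPS_b", "CHS_c"}  # assumed annotations

modules = coexpression_modules(expression, threshold=0.9)
fraction = fraction_with_sm_genes(modules, known_sm_genes)
print(modules, fraction)
```

In this toy setting the three SM-annotated genes fall into one module and two unrelated genes into another, so half of the modules meet the "two or more SM genes" criterion. A real analysis would need a module-detection method that tolerates overlapping modules and a null model for comparison.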
