Evaluation of Sirtuin-3 probe quality and co-expressed genes using literature cohesion

Background: Gene co-expression studies can provide important insights into molecular and cellular signaling pathways. The GeneNetwork database is a unique resource for co-expression analysis using data from a variety of tissues across genetically distinct inbred mice. However, extraction of biologically meaningful co-expressed gene sets is challenging due to variability in microarray platforms, probe quality, normalization methods, and confounding biological factors. In this study, we tested whether literature-derived functional cohesion could be used as an objective metric in lieu of ‘ground truth’ to evaluate the quality of probes and microarray datasets.

Results: We examined Sirtuin-3 (Sirt3) co-expressed gene sets extracted from either liver or brain tissues of BXD recombinant inbred mice in the GeneNetwork database. Depending on the microarray platform, there were as many as 26 probes that targeted different regions of the Sirt3 primary transcript. Co-expressed gene sets (ranging from 100–1000 genes) associated with each Sirt3 probe were evaluated using the previously developed literature-derived cohesion p-value (LPv) and benchmarked against ‘gold standards’ derived from proteomic studies or Gene Ontology classifications. We found that the maximal F-measure was obtained at an average window size of 535 genes. Using a set size of 500 genes, the Pearson correlation between LPv and F-measure was 0.90, and that between LPv and mitochondrial gene enrichment p-values was 0.93. Importantly, we found that the LPv approach can distinguish high-quality Sirt3 probes. Analysis of the most functionally cohesive Sirt3 co-expressed gene set revealed core metabolic pathways that were shared between hippocampus and liver, as well as distinct pathways that were unique to each tissue. These results are consistent with other studies that suggest Sirt3 is a key metabolic regulator and has distinct functions in energy-producing vs. energy-demanding tissues.

Conclusions: Our results provide proof-of-concept that literature cohesion analysis is useful for evaluating the quality of probes and microarray datasets, particularly when experimentally derived gold standards are unavailable. Our approach would enable researchers to rapidly identify biologically meaningful co-expressed gene sets and facilitate discovery from high-throughput genomic data.
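The benchmarking described in the Results (scoring a co-expressed gene set against a gold standard with an F-measure and a gene enrichment p-value) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the gene identifiers in the usage note, and the choice of a one-sided hypergeometric test for enrichment are all assumptions made for illustration, since the abstract does not specify how these quantities were computed.

```python
from math import comb


def f_measure(predicted, gold):
    """Harmonic mean of precision and recall for a gene set vs. a gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def enrichment_pvalue(predicted, gold, universe_size):
    """One-sided hypergeometric tail P(overlap >= observed).

    Assumption: enrichment is tested against a background ("universe")
    of universe_size genes that contains both sets.
    """
    predicted, gold = set(predicted), set(gold)
    k = len(predicted & gold)   # observed overlap
    n = len(predicted)          # size of the co-expressed set
    K = len(gold)               # size of the gold standard
    N = universe_size
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / comb(N, n)
```

As a usage example with hypothetical gene labels, `f_measure({"g1", "g2", "g3", "g4"}, {"g3", "g4", "g5"})` gives 4/7 (precision 1/2, recall 2/3), and `enrichment_pvalue` on the same sets with a universe of 10 genes gives 1/3. Repeating the calculation over set sizes from 100 to 1000 genes would be one way to locate the window with the maximal F-measure.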
