PGS: a tool for association study of high-dimensional microRNA expression data with repeated measures

MOTIVATION: MicroRNAs (miRNAs) are short, single-stranded, non-coding molecules that usually function as negative regulators, silencing or suppressing gene expression. Owing to the dynamic nature of miRNA and to reduced microarray and sequencing costs, a growing number of researchers now measure high-dimensional miRNA expression data with repeated or multiple measures, in which each individual has more than one sample collected and measured over time. However, the commonly used univariate association testing, or site-by-site (SBS) testing, may underutilize the longitudinal structure of such data, leading to underpowered and less biologically meaningful results.

RESULTS: We propose a penalized regression model incorporating a grid search method (PGS) for analyzing associations of high-dimensional miRNA expression data with repeated measures. The development of this analytical framework was motivated by a real-world miRNA dataset. Comparisons between PGS and SBS testing revealed that PGS provided smaller phenotype prediction errors and higher enrichment of phenotype-related biological pathways. Our extensive simulations showed that PGS provided more accurate estimates and higher sensitivity than SBS testing, with comparable specificity.

AVAILABILITY AND IMPLEMENTATION: R source code for the PGS algorithm, an implementation example and the simulation study are available for download at https://github.com/feizhe/PGS.
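The general idea of pairing a penalized regression with a grid search can be illustrated with a minimal sketch. This is not the PGS algorithm: the abstract does not specify the penalty, the grid, or the selection criterion, so everything below is an assumption made for illustration. The sketch assumes an elastic-net penalty fitted by coordinate descent, a small grid over the penalty strength `lam` and mixing weight `alpha`, and selection by held-out prediction error in which all repeated measures from one subject stay in the same fold. The function names and the toy data are hypothetical.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit_elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2).
    Assumes centered data (no intercept is fitted)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def mse(X, y, beta):
    """Mean squared prediction error of a linear predictor."""
    preds = [sum(x * b for x, b in zip(row, beta)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


def grid_search(X, y, subject, lams, alphas, n_folds=3):
    """Pick (lam, alpha) by subject-level cross-validation, then refit on all data."""
    fold_of = {s: k % n_folds for k, s in enumerate(sorted(set(subject)))}
    best = None
    for lam in lams:
        for alpha in alphas:
            err = 0.0
            for f in range(n_folds):
                train = [i for i, s in enumerate(subject) if fold_of[s] != f]
                test = [i for i, s in enumerate(subject) if fold_of[s] == f]
                beta = fit_elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
                err += mse([X[i] for i in test], [y[i] for i in test], beta)
            err /= n_folds
            if best is None or err < best[0]:
                best = (err, lam, alpha)
    _, lam, alpha = best
    return lam, alpha, fit_elastic_net(X, y, lam, alpha)


# Hypothetical toy data: 12 subjects, 3 repeated measures each, 8 features,
# of which only the first two affect the phenotype.
random.seed(0)
X, y, subject = [], [], []
for s in range(12):
    subject_effect = random.gauss(0.0, 0.5)  # shared by a subject's samples
    for _ in range(3):
        row = [random.gauss(0.0, 1.0) for _ in range(8)]
        X.append(row)
        y.append(2.0 * row[0] - 1.5 * row[1] + subject_effect + random.gauss(0.0, 0.3))
        subject.append(s)

best_lam, best_alpha, beta = grid_search(X, y, subject, [0.01, 0.1, 0.5], [0.5, 1.0])
print("selected lam:", best_lam, "alpha:", best_alpha)
print("coefficients:", [round(b, 3) for b in beta])
```

Splitting by subject rather than by sample is the one design choice here that reflects the repeated-measures setting: samples from the same individual are correlated, so placing them on both sides of a split would make the held-out error look better than it is. The within-subject correlation is otherwise ignored in this sketch.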
