Discovery of miR-mRNA interactions via simultaneous Bayesian inference of gene networks and clusters using sequence-based predictions and expression data

MicroRNAs (miRs) are known to interfere with mRNA expression, and much work has gone into predicting and inferring miR-mRNA interactions. Both sequence-based interaction prediction and interaction inference from expression data have proven somewhat successful, and models that combine the two have had even more success. In this paper, I further refine and enrich the methods of miR-mRNA interaction discovery by integrating a Bayesian clustering algorithm into a model of prediction-enhanced miR-mRNA target inference. The resulting algorithm, PEACOAT, is written in the R language. Using both simulated data and a data set of microarrays from samples of multiple myeloma patients, I show that PEACOAT improves the inference of miR-mRNA target interactions. In simulated networks of 25 miRs and mRNAs, the method with clustering improved inference in roughly two-thirds of cases, and in the multiple myeloma data set, KEGG pathway enrichment was more significant with clustering than without. These findings are consistent with previous work on clustering of non-miR genetic networks and indicate that there could be a significant advantage to clustering miR and mRNA expression data as part of interaction inference.
