BioVLAB-MMIA: A Cloud Environment for microRNA and mRNA Integrated Analysis (MMIA) on Amazon EC2

MicroRNAs, by regulating the expression of hundreds of target genes, play critical roles in developmental biology and the etiology of numerous diseases, including cancer. As a vast amount of microRNA expression profile data is now publicly available, integrating microRNA expression data sets with gene expression profiles has become a key problem in life science research. However, genome-wide microRNA-mRNA (gene) integration currently requires sophisticated, high-end informatics tools and significant expertise in bioinformatics and computer science to carry out the complex analysis. In addition, greater computing infrastructure capacity is essential to accommodate large data sets. In this study, we extended the BioVLAB cloud workbench to develop an environment for the integrated analysis of microRNA and mRNA expression data, named BioVLAB-MMIA. The workbench runs computations on Amazon EC2 and S3 resources, orchestrated by the XBaya Workflow Suite. The advantages of BioVLAB-MMIA over the web-based MMIA system include the following: 1) it is readily expanded as new computational tools become available; 2) it is easily modified by reconfiguring graphic icons in the workflow; 3) on-demand cloud computing resources can be used on an “as needed” basis; and 4) distributed orchestration supports complex, long-running workflows asynchronously. We believe that BioVLAB-MMIA will be an easy-to-use computing environment for researchers who plan to perform genome-wide microRNA-mRNA (gene) integrated analysis tasks.
