Open Reading Frame Phylogenetic Analysis on the Cloud

Phylogenetic analysis has become essential for studying the evolutionary relationships between viruses. These relationships are depicted in phylogenetic trees, in which viruses are grouped by sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among noroviruses. Evolutionary relationships are elucidated by aligning the sequences of different open reading frames. The proposed platform correctly identifies the evolutionary relationships between members of the genus Norovirus.
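
As an illustration of what analyzing open reading frames rather than complete sequences involves, the following is a minimal Python sketch of ORF extraction. It is not the method used by the proposed service, which the abstract does not specify. It assumes the standard start codon ATG and stop codons TAA, TAG, and TGA, scans only the three forward-strand reading frames, and uses a hypothetical function name (`find_orfs`) and a hypothetical minimum-length parameter (`min_codons`).

```python
from typing import List, Tuple

# Assumption: standard genetic code start/stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf) tuples for ORFs on the forward strand.

    start is 0-based and inclusive; end is exclusive and includes the
    stop codon. min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)


if __name__ == "__main__":
    for start, end, orf in find_orfs("CCATGAAATTTTAGGG"):
        print(start, end, orf)
```

In a pipeline of the kind described, ORFs extracted this way from each viral genome would then be passed to a multiple sequence alignment and tree-building step; the reverse strand and alternative start codons are deliberately left out of this sketch.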
