Multiple Sequence Alignments with Parallel Computing

With the growth of the bioinformatics and computational biology industry, multiple sequence alignment (MSA) applications have become an important emerging workload. Despite the considerable recent attention given to MSA software design, there has been little quantitative understanding of how such applications perform on modern microprocessors and systems. In this paper we analyze the performance and characteristics of MSA software from the perspective of multicore machines. We use several popular MSA programs that employ a wide variety of alignment approaches. We examine the basic workload characteristics and the efficiency of various multicore machine features. To map parallelism onto multicore machines, we explore different parallel programming approaches using threads and MPI.
