Modern Computational Techniques for the HMMER Sequence Analysis

This paper surveys and critically reviews the latest research on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications: hidden Markov models (HMMs). We present a detailed performance comparison of sequence analysis tools on the various computing platforms recently developed in the bioinformatics community. Because sequence analysis is both data-intensive and compute-intensive, it is an attractive target for optimization and parallelization, using both traditional software approaches and innovative hardware acceleration technologies.
