HMMER web server: interactive sequence similarity searching

HMMER is a software suite for protein sequence similarity searches using probabilistic methods. Previously, HMMER has mainly been available only as a computationally intensive UNIX command-line tool, restricting its use. Recent advances in the software, HMMER3, have resulted in a 100-fold speed gain relative to previous versions. It is now feasible to make efficient profile hidden Markov model (profile HMM) searches via the web. A HMMER web server (http://hmmer.janelia.org) has been designed and implemented such that most protein database searches return within a few seconds. Methods are available for searching either a single protein sequence, multiple protein sequence alignment or profile HMM against a target sequence database, and for searching a protein sequence against Pfam. The web server is designed to cater to a range of different user expertise and accepts batch uploading of multiple queries at once. All search methods are also available as RESTful web services, thereby allowing them to be readily integrated as remotely executed tasks in locally scripted workflows. We have focused on minimizing search times and the ability to rapidly display tabular results, regardless of the number of matches found, developing graphical summaries of the search results to provide quick, intuitive appraisement of them.

[1]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[2]  Sean R. Eddy,et al.  A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation , 2008, PLoS Comput. Biol..

[3]  Christine A. Orengo,et al.  Gene3D: merging structure and function for a Thousand genomes , 2009, Nucleic Acids Res..

[4]  L. Holm,et al.  The Pfam protein families database , 2005, Nucleic Acids Res..

[5]  S. Altschul,et al.  The compositional adjustment of amino acid substitution matrices , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[6]  Robert D. Finn,et al.  InterPro: the integrative protein signature database , 2008, Nucleic Acids Res..

[7]  David S. Goodsell,et al.  The RCSB Protein Data Bank: redesigned web site and web services , 2010, Nucleic Acids Res..

[8]  Robert D. Finn,et al.  Pfam: clans, web tools and services , 2005, Nucleic Acids Res..

[9]  Anushya Muruganujan,et al.  PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium , 2009, Nucleic Acids Res..

[10]  Baris E. Suzek,et al.  The Universal Protein Resource (UniProt) in 2010 , 2009, Nucleic Acids Res..

[11]  S. Altschul Amino acid substitution matrices from an information theoretic perspective , 1991, Journal of Molecular Biology.

[12]  Sean R Eddy,et al.  A new generation of homology search tools based on probabilistic inference. , 2009, Genome informatics. International Conference on Genome Informatics.

[13]  Michael Farrar,et al.  Sequence analysis Striped Smith – Waterman speeds database searches six times over other SIMD implementations , 2007 .

[14]  Cyrus Chothia,et al.  SUPERFAMILY 1.75 including a domain-centric gene ontology method , 2010, Nucleic Acids Res..

[15]  María Martín,et al.  The Universal Protein Resource (UniProt) in 2010 , 2010 .

[16]  Michelle G. Giglio,et al.  TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes , 2006, Nucleic Acids Res..

[17]  Carole A. Goble,et al.  BioCatalogue: a universal catalogue of web services for the life sciences , 2010, Nucleic Acids Res..

[18]  Tanya Z. Berardini,et al.  PatMatch: a program for finding patterns in peptide and nucleotide sequences , 2005, Nucleic Acids Res..

[19]  Gregory D. Schuler,et al.  Database resources of the National Center for Biotechnology Information: update , 2004, Nucleic acids research.

[20]  Sameer Velankar,et al.  E-MSD: an integrated data resource for bioinformatics , 2004, Nucleic Acids Res..